This repository was archived by the owner on Jan 7, 2025. It is now read-only.

Conversation

@taltmans

  • Added functionality for integrating DIGITS with S3 endpoints, along with accompanying unit tests.

@gheinrich (Contributor) left a comment

This looks very good, thanks! I have restarted the Torch CI job, as it could have been a transient failure. Can you squash your commits and see if you can fix the lint errors on https://travis-ci.org/NVIDIA/DIGITS/jobs/295123107?

Thanks!

```python
)

def from_s3(job, form):
    print('from_s3 in progress...')
```
Contributor

can you remove this print statement?

```python
args.append('--compression=%s' % self.compression)
if self.backend == 'hdf5':
    args.append('--hdf5_dset_limit=%d' % 2**31)
if self.delete_files is not None and self.delete_files is True:
```
Contributor

why not just `if self.delete_files`?
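
A minimal before/after sketch of the suggested simplification (the variable is shown as a local so the snippet stands alone; the two forms are equivalent as long as the value is only ever `None` or a `bool`):

```python
delete_files = True  # e.g. None when unset, otherwise a bool

# verbose form from the diff:
if delete_files is not None and delete_files is True:
    print('flag set')

# idiomatic equivalent, assuming the value is only ever None or a bool:
if delete_files:
    print('flag set')
```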

@taltmans (Author) commented Nov 2, 2017

Hello, we made the changes you requested and fixed the lint warnings in the S3 integration files. Did you determine whether the CI job failure was transient?

@gheinrich (Contributor)

Thank you @taltmans, the CI failure was indeed transient. Good job fixing the lint errors; you got a pass on your last commit! I'll review the changes again and get back to you. Thanks!

@TimZaman (Contributor) commented Nov 3, 2017

Don't forget to squash all commits. Also, it would be great if the S3 readme could be extended a bit more.

```
@@ -0,0 +1,20 @@
# S3 Integration - Installing Boto
```
Contributor

why not add boto to requirements.txt file? Sounds easier than doing the installation from source, right?

Author

Good point, we made that modification and removed this S3Installation.md file entirely.


```python
print('host: ' + self.host)
print('is secure: ' + str(self.is_secure))
print('port: ' + str(self.port))
```
Contributor

These prints could be replaced with calls to a logger as in https://github.com/NVIDIA/DIGITS/blob/master/digits/tools/parse_folder.py#L462.
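
For reference, a minimal sketch of the suggested pattern, using the stdlib `logging` module in the style of `parse_folder.py` (the logger name and the constructor signature shown are illustrative, not the actual DIGITS code):

```python
import logging

# Illustrative module-level logger, following the parse_folder.py
# pattern instead of printing to stdout.
logger = logging.getLogger('digits.tools.s3_walker')


class S3Walker(object):
    def __init__(self, host, is_secure=True, port=443):
        self.host = host
        self.is_secure = is_secure
        self.port = port
        # %-style arguments defer string formatting until the
        # record is actually emitted.
        logger.info('host: %s', self.host)
        logger.info('is secure: %s', self.is_secure)
        logger.info('port: %s', self.port)
```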

Author

We removed these and made a couple of other modifications to suppress printing to stdout.

```python
s3_endpoint = utils.forms.StringField(
    u'Training Images',
```
Contributor

It might be clearer if you name this "S3 endpoint URL".

Author

Done!

```python
if len(keys) >= max_size:
    break

print('retrieved ' + str(len(keys)) + ' keys from ' + keys[0] + ' to ' + keys[-1])
```
Contributor

this raised an exception in my case: `IndexError: list index out of range`. I am unfamiliar with Boto/S3, so I am not sure what this means. The folder I pointed to was not empty.

Author

The S3Walker class filters the keys based on prefix. Most likely, the folder you pointed to did not have any keys with the relevant prefix, so the keys list was empty at that point, leading to the exception. We added logic to skip this line when the keys list is empty, and added instructions to examples/s3/README.md explaining how to set up the S3 endpoint so this doesn't occur in the first place!
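
A minimal sketch of the guard described above (names follow the diff; the `keys` value is stubbed so the snippet stands alone):

```python
keys = []  # e.g. the result of walker.listbucket(...) when nothing matches the prefix

if keys:
    # An empty list would make keys[0] / keys[-1] raise IndexError,
    # so only report the range when at least one key matched.
    print('retrieved ' + str(len(keys)) + ' keys from ' + keys[0] + ' to ' + keys[-1])
```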

```python
print('making list bucket with prefix...')
keys = walker.listbucket(bucket, path, with_prefix=True)

print('making list bucket without prefix...')
```
Contributor

can you explain why you're doing this with and without prefix?

Author

We reviewed this code and removed it, as it was no longer in use. From an S3 perspective, the `with_prefix` argument determines whether the `listbucket` method returns key names with the prefix or not (i.e. mnist/train/9/59942.png versus 59942.png).
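
To illustrate the distinction, here is a hypothetical helper (S3Walker's actual internals may differ) showing the effect of the flag:

```python
def format_key(key_name, prefix, with_prefix):
    """Show the effect of the with_prefix flag on a returned key name."""
    if with_prefix:
        # keep the full bucket-relative name
        return key_name
    # strip the prefix and any leading separator, leaving the leaf name
    return key_name[len(prefix):].lstrip('/')


print(format_key('mnist/train/9/59942.png', 'mnist/train/9', True))   # mnist/train/9/59942.png
print(format_key('mnist/train/9/59942.png', 'mnist/train/9', False))  # 59942.png
```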

```md
## Introduction ##
Boto is a Python library that DIGITS requires in order to interact with S3. It is not needed when DIGITS trains on local files, but it is required for retrieving files from any S3 endpoint.

## Installation ##
```
Contributor

Could you add an example walkthrough that shows how to populate an S3 bucket with the expected contents and then how to load the data into DIGITS? It would be useful for laymen like me who are not so familiar with all of this :-)

Author

We added some instructions to examples/s3/README.md and an accompanying link in the base README.md. Please let us know if any of it could use further detail.
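
For a sense of what such a walkthrough covers, here is a minimal sketch of populating a bucket with a LeNet-style folder layout using boto (the endpoint, credentials, and bucket name are placeholders; the scripts under examples/s3 are the canonical reference):

```python
import os

from boto.s3.connection import OrdinaryCallingFormat, S3Connection
from boto.s3.key import Key

# Placeholder endpoint and credentials -- substitute your own.
conn = S3Connection(aws_access_key_id='ACCESS_KEY',
                    aws_secret_access_key='SECRET_KEY',
                    host='s3.example.com',
                    port=9000,
                    is_secure=False,
                    calling_format=OrdinaryCallingFormat())
bucket = conn.create_bucket('digits-data')

# Mirror a local mnist/train/<label>/<image> layout into the bucket so
# DIGITS can list keys under the 'mnist/train' prefix.
root = os.path.expanduser('~/mnist/train')
for dirpath, _, filenames in os.walk(root):
    for filename in filenames:
        local_path = os.path.join(dirpath, filename)
        key_name = 'mnist/train/' + os.path.relpath(local_path, root).replace(os.sep, '/')
        Key(bucket, key_name).set_contents_from_filename(local_path)
```

The S3 endpoint URL, bucket name, and `mnist/train` prefix entered in the DIGITS dataset form then identify the uploaded images.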

@taltmans force-pushed the s3integration branch 6 times, most recently from 87ab35a to af3b46a on November 9, 2017 at 00:51
@taltmans (Author) commented Nov 9, 2017

I squashed the commits and replied to each of your comments above. Please let us know if you have any more feedback.

@gheinrich (Contributor) left a comment

Hello, thanks for the updates. I uploaded MNIST to AWS S3 and tried to create a dataset, but got this error:

```
2017-11-13 15:29:22 [20171113-152920-ad91] [WARNING] Parse Folder (train/val) unrecognized output: File "/usr/lib/python2.7/logging/__init__.py", line 861, in emit
2017-11-13 15:29:22 [20171113-152920-ad91] [WARNING] Parse Folder (train/val) unrecognized output: msg = self.format(record)
2017-11-13 15:29:22 [20171113-152920-ad91] [WARNING] Parse Folder (train/val) unrecognized output: File "/usr/lib/python2.7/logging/__init__.py", line 734, in format
2017-11-13 15:29:22 [20171113-152920-ad91] [WARNING] Parse Folder (train/val) unrecognized output: return fmt.format(record)
2017-11-13 15:29:22 [20171113-152920-ad91] [WARNING] Parse Folder (train/val) unrecognized output: File "/usr/lib/python2.7/logging/__init__.py", line 469, in format
2017-11-13 15:29:22 [20171113-152920-ad91] [WARNING] Parse Folder (train/val) unrecognized output: s = self._fmt % record.__dict__
2017-11-13 15:29:22 [20171113-152920-ad91] [WARNING] Parse Folder (train/val) unrecognized output: KeyError: 'job_id'
2017-11-13 15:29:22 [20171113-152920-ad91] [WARNING] Parse Folder (train/val) unrecognized output: Logged from file s3_walker.py, line 30
```

Any idea? Thanks!

Once that file has been configured appropriately, it may be run using:

```sh
python upload_mnist.py ~/mnist
```
Contributor

did you mean `upload_s3_data.py` here?

@taltmans (Author)

Good catch on the README item. The logging warnings you sent turned out to be an import issue, which we fixed. Despite the warnings, the dataset creation job should have completed properly; did it complete on your end?

@gheinrich (Contributor)

Yes, thanks, I was able to create the dataset and train a model. I think this is good to go. Can you squash your commits?

@taltmans (Author)

I just squashed the commits; we're good to go on our end.

@gheinrich merged commit ab2048d into NVIDIA:master on Nov 14, 2017
@gheinrich (Contributor)

Thanks for a great feature!
