Description
What happened:
Using the latest version of s3fs, we get the following error when attempting to open a remote file:
Traceback (most recent call last):
  File "/home/ray/anaconda3/lib/python3.7/site-packages/s3fs/core.py", line 233, in _call_s3
    out = await method(**additional_kwargs)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/aiobotocore/client.py", line 154, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (PreconditionFailed) when calling the GetObject operation: At least one of the pre-conditions you specified did not hold

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/ray/anaconda3/lib/python3.7/site-packages/s3fs/core.py", line 1834, in _fetch_range
    req_kw=self.req_kw,
  File "/home/ray/anaconda3/lib/python3.7/site-packages/s3fs/core.py", line 1975, in _fetch_range
    **req_kw,
  File "/home/ray/anaconda3/lib/python3.7/site-packages/fsspec/asyn.py", line 72, in wrapper
    return sync(self.loop, func, *args, **kwargs)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/fsspec/asyn.py", line 53, in sync
    raise result[0]
  File "/home/ray/anaconda3/lib/python3.7/site-packages/fsspec/asyn.py", line 20, in _runner
    result[0] = await coro
  File "/home/ray/anaconda3/lib/python3.7/site-packages/s3fs/core.py", line 252, in _call_s3
    raise translate_boto_error(err)
OSError: [Errno 22] At least one of the pre-conditions you specified did not hold

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  <redacted>
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ludwig/utils/data_utils.py", line 109, in read_xsv
    dialect = csv.Sniffer().sniff(csvfile.read(1024 * 100),
  File "/home/ray/anaconda3/lib/python3.7/site-packages/fsspec/spec.py", line 1449, in read
    out = self.cache._fetch(self.loc, self.loc + length)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/fsspec/caching.py", line 376, in _fetch
    self.cache = self.fetcher(start, bend)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/s3fs/core.py", line 1841, in _fetch_range
    ) from ex
s3fs.utils.FileExpired: [Errno 16] The remote file corresponding to filename <redacted> and Etag "<redacted>" no longer exists.
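For what it's worth, the s3fs.utils.FileExpired at the end is raised from a PreconditionFailed on GetObject, which suggests s3fs conditions its ranged reads on the ETag it recorded when the file was opened. A rough boto3 sketch of that request shape (endpoint, credentials, bucket and key are placeholders, not taken from our setup):

import boto3

# Ranged GetObject conditioned on the ETag seen at open time -- roughly the
# request shape implied by the traceback above.
s3 = boto3.client(
    's3',
    endpoint_url='http://localhost:9000',
    aws_access_key_id='...',
    aws_secret_access_key='...',
)
etag = s3.head_object(Bucket='bucket', Key='object')['ETag']
# If the backend does not honour If-Match, this raises
# botocore.exceptions.ClientError (PreconditionFailed), which s3fs then
# surfaces as FileExpired.
s3.get_object(
    Bucket='bucket',
    Key='object',
    Range='bytes=0-102399',
    IfMatch=etag,
)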
The file is being opened through the fsspec entrypoint:
import fsspec

of = fsspec.open(url, **storage_options)
with of as f:
    ...
The twist here is that the URL uses the s3:// scheme, but the file actually lives in Azure Blob Storage: we're using the MinIO Azure Gateway to expose an S3-compatible layer over it.
Storage options are fairly straightforward for MinIO:
storage_options = {
    'endpointUrl': 'http://localhost:9000',
    'awsAccessKeyId': '...',
    'awsSecretAccessKey': '...'
}
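For comparison, here is a rough sketch of opening the same object with s3fs's native option names (key, secret, and client_kwargs['endpoint_url']), in case the camelCase names above are specific to our application layer; the endpoint and credentials below are placeholders:

import fsspec

# Same MinIO endpoint expressed with s3fs's own parameter names.
direct_options = {
    'key': '...',     # access key configured on the MinIO gateway
    'secret': '...',  # secret key configured on the MinIO gateway
    'client_kwargs': {'endpoint_url': 'http://localhost:9000'},
}
with fsspec.open('s3://bucket/object', **direct_options) as f:
    f.read()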
What you expected to happen:
Using s3fs==2021.4.0, this operation works fine.
Minimal Complete Verifiable Example:
Set up MinIO, then provide an s3 path to a bucket/object in MinIO:
import fsspec

url = 's3://bucket/object'
of = fsspec.open(url, **storage_options)
with of as f:
    print(f.read())

Anything else we need to know?:
Environment:
- Dask version: 2021.5.0
- Python version: 3.7.7
- Operating System: Linux (Ubuntu)
- Install method (conda, pip, source): pip