introduce new S3 native provider #8786
This already looks amazing @bentsku, well done! The separation into the StorageBackend is really good and the extra flexibility we gain will help a lot moving forward with issues around storage, performance, and persistence!
It's a big PR, but it already looks very good! Obviously lots of TODOs and blanks still in there, but that's fine at this stage. I'm also not going to nitpick much in the code. I don't see a lot of things that could really be improved, and I think it would be good to get a first version out ASAP.
Just some high level thoughts we could discuss:
The KeyStore / StorageBackend duality is interesting to me, and I'm noticing that the API needs to keep the two in sync, which could potentially be a source of bugs when not done correctly. I don't know whether there's any value in trying to combine the two concepts. In general, the separation of Metadata and Data management makes a lot of sense to me though.
It could make sense to add more higher-level constructs to the storage backend, pushing more of the functionality that the API currently handles into the storage provider and basically hiding file objects altogether.
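To illustrate the point about keeping metadata and data in sync, here is a minimal sketch of a single facade that owns both, so callers cannot update one without the other. The class `ObjectStoreFacade`, its methods, and the in-memory dictionaries are all hypothetical and are not taken from the PR; they only show the shape of the idea.

```python
from typing import Dict, Optional, Tuple

ObjectId = Tuple[str, str]  # (bucket, key)


class ObjectStoreFacade:
    """Hypothetical facade owning both the metadata index and the stored bytes."""

    def __init__(self) -> None:
        self._metadata: Dict[ObjectId, Dict[str, str]] = {}
        self._data: Dict[ObjectId, bytes] = {}

    def put(
        self,
        bucket: str,
        key: str,
        data: bytes,
        metadata: Optional[Dict[str, str]] = None,
    ) -> None:
        object_id = (bucket, key)
        # Store the bytes first so metadata never points at missing data.
        self._data[object_id] = data
        self._metadata[object_id] = dict(metadata or {}, size=str(len(data)))

    def get(self, bucket: str, key: str) -> Tuple[Dict[str, str], bytes]:
        object_id = (bucket, key)
        return self._metadata[object_id], self._data[object_id]

    def delete(self, bucket: str, key: str) -> None:
        object_id = (bucket, key)
        # Both sides are removed in one call; missing entries are ignored.
        self._metadata.pop(object_id, None)
        self._data.pop(object_id, None)
```

With this shape the API layer never touches file objects directly; it only calls `put`, `get`, and `delete`.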
425c5ab to 5e26d7d
This iteration has made the code so much cleaner, thanks for bearing with me @bentsku and considering the changes!
The additional S3ObjectStore layer has made the Provider implementation much easier to understand, and isolation is much better 👌
My comments are mostly around adding more doc strings, and some minor tweaks. Some of the things can be tackled as follow-up PRs.
Really excellent work! 💯 🚀
localstack/services/s3/exceptions.py
Outdated
q: is there any reason they are not generated in the API yet? i suppose historically they weren't added through spec patches?
Basically the idea was that if the exception needed something specific, like extra fields or a specific default status code, we would generate it. Otherwise, we would follow the usual path of subclassing CommonServiceException, like many other services. I just pulled them out of the previous provider so that I can use them in both, but in the future, once we only have one again, we could pull them back inside the provider file? Is that a good idea?
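As a rough illustration of the "usual path" mentioned above, a hand-written exception might look like the sketch below. The two base classes here are simplified local stand-ins so the snippet runs on its own; the real LocalStack classes and their constructor signatures may differ, and `ExampleInvalidState` is a made-up name, not an S3 error from the PR.

```python
class ServiceException(Exception):
    """Simplified stand-in for a framework-level service exception."""

    def __init__(self, message: str = "") -> None:
        super().__init__(message)
        self.message = message


class CommonServiceException(ServiceException):
    """Simplified stand-in: carries an error code and an HTTP status code."""

    def __init__(self, code: str, message: str, status_code: int = 400) -> None:
        super().__init__(message)
        self.code = code
        self.status_code = status_code


class ExampleInvalidState(CommonServiceException):
    """Hypothetical service error that needs no extra fields."""

    def __init__(self, message: str) -> None:
        # Only the code and status differ from the common base.
        super().__init__("ExampleInvalidState", message, status_code=409)
```

An exception that needs extra response fields would not fit this pattern, which is the case the comment reserves for generated classes.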
if isinstance(found_object, S3Object):
    self._storage_backend.remove(bucket, found_object)

# TODO: request charged
?
is this comment still about batching delete?
localstack/services/s3/v3/storage.py
Outdated
Create a temporary directory representing a bucket
:param bucket_name
"""
tmp_dir = mkdtemp()
let's try and restructure the class so we can inject a root_directory path as dependency into the constructor, that is then used for all filesystem operations. this will help massively with persistence.
You mean in the base class? Because this one is not really supposed to be used with persistence? but for testing, sure, it makes a lot of sense
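A minimal sketch of the constructor injection discussed in this thread, assuming a backend whose filesystem operations are all resolved against one root path. The class `DirectoryStorageBackend` and its method names are hypothetical and do not mirror the PR's actual classes.

```python
import shutil
import tempfile
from pathlib import Path


class DirectoryStorageBackend:
    """Hypothetical backend: every filesystem operation is rooted at root_directory."""

    def __init__(self, root_directory: str) -> None:
        self.root_directory = Path(root_directory)
        self.root_directory.mkdir(parents=True, exist_ok=True)

    def create_bucket_directory(self, bucket_name: str) -> Path:
        # Buckets become sub-directories of the injected root.
        bucket_dir = self.root_directory / bucket_name
        bucket_dir.mkdir(exist_ok=True)
        return bucket_dir

    def delete_bucket_directory(self, bucket_name: str) -> None:
        shutil.rmtree(self.root_directory / bucket_name, ignore_errors=True)


def make_test_backend() -> DirectoryStorageBackend:
    # Tests can inject a throwaway root; persistence could inject a stable path.
    return DirectoryStorageBackend(tempfile.mkdtemp())
```

The caller decides where data lives, so the same class could serve both temporary and persistent setups.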
5e26d7d to 8098b88
8098b88 to c9bce20
First PR to implement the new S3 native provider not depending on moto. This implements all the basic abstractions, as well as the basic operations around buckets and objects.
There are a lot of comments left, as it's still WIP, and this introduces only basic behaviour. We will need to implement all the bucket operations to be able to get the complete behaviour from basic operations.
List of implemented operations:
Abstractions
This PR implements new abstractions around S3. We are splitting between "data", aka the buckets, objects and metadata around them, and the actual file objects containing the S3 objects' real content. This will allow us to properly split how we save these for persistence in the next iteration.
We're introducing a new S3 BaseStorageBackend and TemporaryStorageBackend, which abstract away how those S3 file objects, which we will name "assets", are stored and retrieved. This will in turn allow us to either have a temporary filesystem, or the real filesystem in case of persistence, and give us greater flexibility.
The last core Operations and Models outside of Object and Bucket configuration are the multipart ones, coming in #8787
There are a lot of bugs in the versioning code path, because it couldn't be tested before I could enable versioning on the bucket. That is now possible, and the bugs are fixed with #8799.
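The split between an abstract storage backend and a temporary implementation, as described under Abstractions, could be sketched roughly as follows. The names `AssetStorage` and `TemporaryAssetStorage`, the method signatures, and the choice to hash keys into file names are all assumptions made for illustration; they stand in for BaseStorageBackend and TemporaryStorageBackend and are not the PR's actual interface.

```python
import abc
import hashlib
import shutil
import tempfile
from pathlib import Path


class AssetStorage(abc.ABC):
    """Hypothetical base: stores and retrieves raw object bytes ("assets")."""

    @abc.abstractmethod
    def write(self, bucket: str, key: str, data: bytes) -> None: ...

    @abc.abstractmethod
    def read(self, bucket: str, key: str) -> bytes: ...

    @abc.abstractmethod
    def remove(self, bucket: str, key: str) -> None: ...


class TemporaryAssetStorage(AssetStorage):
    """Hypothetical implementation backed by a throwaway temp directory."""

    def __init__(self) -> None:
        self._root = Path(tempfile.mkdtemp())

    def _path(self, bucket: str, key: str) -> Path:
        # Keys may contain "/" so they are hashed into a flat file name.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self._root / bucket / digest

    def write(self, bucket: str, key: str, data: bytes) -> None:
        path = self._path(bucket, key)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def read(self, bucket: str, key: str) -> bytes:
        return self._path(bucket, key).read_bytes()

    def remove(self, bucket: str, key: str) -> None:
        self._path(bucket, key).unlink(missing_ok=True)

    def close(self) -> None:
        # Delete everything this backend wrote.
        shutil.rmtree(self._root, ignore_errors=True)
```

A persistent variant would implement the same three methods against a stable directory, leaving the provider code unchanged.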