-
-
Notifications
You must be signed in to change notification settings - Fork 4.2k
improve S3 v3 storage layer #8927
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -309,8 +309,16 @@ def on_after_init(self): | |||
modify_service_response.append(S3Provider.service, s3_presigned_url_response_handler) | |||
register_website_hosting_routes(router=ROUTER) | |||
|
|||
def accept_state_visitor(self, visitor: StateVisitor): | |||
visitor.visit(s3_stores) | |||
visitor.visit(AssetDirectory(self._storage_backend.root_directory)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: this won't work for community cloud pods, as there's a check that the path should start with config.dirs.data
, and files might not be saved here in the community version as they're spooled
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
acceptable limitation for now 👍
bcce35f
to
626bc96
Compare
edfad3a
to
12b47d5
Compare
555135b
to
2c1a505
Compare
12b47d5
to
952e428
Compare
4dedad3
to
bc4a45b
Compare
bc4a45b
to
68e49ac
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice! glad we did the iterations and now have a clean distinction between reset/flush/close in the contract 👍
@@ -309,8 +309,16 @@ def on_after_init(self): | |||
modify_service_response.append(S3Provider.service, s3_presigned_url_response_handler) | |||
register_website_hosting_routes(router=ROUTER) | |||
|
|||
def accept_state_visitor(self, visitor: StateVisitor): | |||
visitor.visit(s3_stores) | |||
visitor.visit(AssetDirectory(self._storage_backend.root_directory)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
acceptable limitation for now 👍
def create_bucket(self, bucket: BucketName): | ||
pass | ||
|
||
def delete_bucket(self, bucket: BucketName): | ||
pass | ||
|
||
def flush(self): | ||
pass | ||
|
||
def close(self): | ||
pass | ||
|
||
def reset(self): | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe some doc strings would be helpful to explain what the different behavior of flush,close,reset are
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do 👍 funnily, I don't think we use flush anywhere at the moment, as persistence is always "on" for the filesystem storage backend. But we could keep it for some fun behaviour with community pods or something?
def __init__(self) -> None: | ||
super().__init__() | ||
self._storage_backend = EphemeralS3ObjectStore() | ||
self._storage_backend = EphemeralS3ObjectStore(DEFAULT_S3_TMP_DIR) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i mentioned this before: maybe we could allow injecting a storage backend into the constructor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. But isn't having the logic of the storage backend inside providers.py
, depending on config
be a bit weird? Are there other services doing this?
Thanks for the quick chat on Slack, I will make it optional so that we can inject it from pro and have the same default value. Awesome 👍
Motivation
Improving the current S3 storage layer, introduce 2 more operations to allow easier cleanup of resources with bucket creation and deletion. Also separate between
close
andreset
the store, to have better semantics between these 2 operations (closing does not always mean deleting all underlying objects, especially with persistence enabled).Separate between the
close
call for the Ephemeral object, so that it doesn't close the underlying file object, which would delete it in Ephemeral mode, following the fix to the response with #8926.Changes
Introduce said changes above,
create_bucket
anddelete_bucket
for the storage layer, and preparing support for persistence.Also added a quick fix regarding creating bucket in
us-east-1
, I wasn't returning early from the method and ended up overwriting the bucket in the store.Make
close
a noop forEphemeralS3StoredObject
, and implement adelete
method for cleaning up.