-
Couldn't load subscription status.
- Fork 2.7k
Wrap http_util into a http_util_container object, so that in-use HTTPUtil can be reached from FileOpener #18103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
…Util can be reached from FileOpener Without this PR, there are situations where `httpfs` extension defaults to own HTTPFSUtil, instead of using the global registered one. I think while this currently work, it's somewhat of a code smell, and basically prevent other extensions (or clients, such as duckdb-wasm) to have a clean path to override the default http_util to be used. This is possibly also a problem connected to introduction of curl in the `httpfs` extension, since it's not possible for `httpfs` to offer two interfaces, and register one or the other depending on settings. Solution is adding a layer of indirection, where the `http_util` shared_ptr is moved from DBConfig to an inner class HTTPUtilContainer. DBConfig keeps then a shared_ptr to HTTPUtilContainer, that keeps a shared_ptr to HTTPUtil currently in use. HTTPUtil keeps a weak_ptr back to the HTTPUtilContainer, since this makes so that having a HTTPUtil object in hand allows to walk back to the default HTTPUtil (and possibly setting it to a different value). HTTPUtilContainer is also make compulsory for constructing a FileOpener, so that with a FileOpener it's always possible to get the currently in-use HTTPUtil class. There are some layers of pointers, but this allows to remove a current limitation that is that HTTPFSUtil can be the only one to override HTTPUtil (and this needs to be 1 way only). Main question is whether this is feasible. Secondary questions are: * are there other objects that should be moved from DBConfig to HTTPUtilContainer? * how HTTPUtilContainer should be actually named? GlobalFileSystemProperties?
|
Discussed on a side with @Mytherin, another approach would be adding a less invasive: to FileOpener class. I will have a look at that and then update. |
carlopi
added a commit
to carlopi/duckdb
that referenced
this pull request
Jul 1, 2025
Without this PR, there are situations where `httpfs` extension defaults to own HTTPFSUtil, instead of using the global registered one. I think while this currently work, it's somewhat of a code smell, and basically prevent other extensions (or clients, such as duckdb-wasm) to have a clean path to override the default http_util to be used. This is possibly also a problem connected to introduction of curl in the `httpfs` extension, since it's not possible for `httpfs` to offer two interfaces, and register one or the other depending on settings. This commit forces FileOpener to have a way to return a reference to the original shared_ptr<HTTPUtil> (possibly caching on costructor / other ways). Then the main improvement is the chagnge (currently via a patch) that allows in duckdb-httpfs. This is an improved / slimmer version of duckdb#18103, thanks to feedback from @Mytherin.
carlopi
added a commit
to carlopi/duckdb
that referenced
this pull request
Jul 1, 2025
Without this PR, there are situations where `httpfs` extension defaults to own HTTPFSUtil, instead of using the global registered one. I think while this currently work, it's somewhat of a code smell, and basically prevent other extensions (or clients, such as duckdb-wasm) to have a clean path to override the default http_util to be used. This is possibly also a problem connected to introduction of curl in the `httpfs` extension, since it's not possible for `httpfs` to offer two interfaces, and register one or the other depending on settings. This commit forces FileOpener to have a way to return a reference to the original shared_ptr<HTTPUtil> (possibly caching on costructor / other ways). Then the main improvement is the chagnge (currently via a patch) that allows in duckdb-httpfs. This is an improved / slimmer version of duckdb#18103, thanks to feedback from @Mytherin.
Mytherin
added a commit
that referenced
this pull request
Jul 1, 2025
Without this PR, there are situations where `httpfs` extension defaults to own HTTPFSUtil, instead of using the global registered one. I think while this currently work, it's somewhat of a code smell, and basically prevent other extensions (or clients, such as duckdb-wasm) to have a clean path to override the default http_util to be used. This is possibly also a problem connected to introduction of curl in the `httpfs` extension, since it's not possible for `httpfs` to offer two interfaces, and register one or the other depending on settings. This commit forces FileOpener to have a way to return a reference to the original shared_ptr<HTTPUtil> (possibly caching on costructor / other ways). Then the main improvement is the chagnge (currently via a patch) that allows in duckdb-httpfs. This is a less clunky / slimmer version of #18103, thanks to feedback from @Mytherin.
|
Superseeded by #18107 |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Without this PR, there are situations where
httpfsextension defaults to own HTTPFSUtil, instead of using the global registered one. I think while this currently work, it's somewhat of a code smell, and basically prevent other extensions (or clients, such as duckdb-wasm) to have a clean path to override the default http_util to be used.This is possibly also a problem connected to introduction of curl in the
httpfsextension, since it's not possible forhttpfsto offer two interfaces, and register one or the other depending on settings.Solution is adding a layer of indirection, where the
http_utilshared_ptr is moved from DBConfig to an inner class HTTPUtilContainer. DBConfig keeps then a shared_ptr to HTTPUtilContainer, that keeps a shared_ptr to HTTPUtil currently in use. HTTPUtil keeps a weak_ptr back to the HTTPUtilContainer, since this makes so that having a HTTPUtil object in hand allows to walk back to the default HTTPUtil (and possibly setting it to a different value).HTTPUtilContainer is also make compulsory for constructing a FileOpener, so that with a FileOpener it's always possible to get the currently in-use HTTPUtil class.
There are some layers of pointers, but this allows to remove a current limitation that is that HTTPFSUtil can be the only one to override HTTPUtil (and this needs to be 1 way only).
Main question is whether this is feasible.
Secondary questions are:
Main relevant change is the patch to HTTPFS, that goes from:
to
throw is there for testing, but haven't managed to trigger. I would think the opener is always present, but to limit damages if this is not correct would be to add a logging line + add back
return make_shared_ptr<HTTPFSUtil>();, so that worst case same behavior is kept.