Move core_functions to a separate extension#14149
Merged
Mytherin merged 50 commits intoduckdb:featurefrom Sep 28, 2024
Merged
Conversation
carlopi
approved these changes
Sep 27, 2024
Contributor
carlopi
left a comment
There was a problem hiding this comment.
This looks very promising, thanks!!!
…meter names/example
…n function placeholders when generating extension entries
Contributor
|
I just discovered this, the route to generate |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR moves the
core_functionsto a separate extension, instead of sitting inline in the main source tree. Thecore_functionslibrary is statically shipped with (almost) all clients, similar to how theparquetclient is shipped. The main idea here is that for the WASM build we can leave out thecore_functionsof the main binary distribution - and load it only later when one of the functions is referenced. This decreases the size of the initial WASM payload.For testing purposes, the
core_functionsextension is always loaded currently, as many tests rely on core functions being present and adding the correctrequiresand fixing all of the C++ tests is a lot of work. This should be done at some point - in particular to test the auto-loading fully.Moving Functions To Main
As part of this PR, we must make sure that there are no references to
core_functionsfrom within the main library. This PR accomplishes that by moving a number of functions out ofcore_functionsand into the main library. At some point we should revisit this - as likely several functions can be moved back out of the main library and intocore_functionswith more careful planning.Auto-Loading Function Overloads
One side-effect of moving
core_functionsto a separate extension is that there are now functions that are defined in multiple extensions. For example, theyearfunction is defined in bothicuandcore_functions, but with different overloads. Our auto-loading functionality is currently based on auto-loading entire functions - so that if a function is not found in the catalog, the extension is auto-loaded. With different overloads in different extensions this becomes ambiguous (which one do you auto-load?). In addition, the auto-loading causes weird behavior if only one of the two extensions are loaded - since the correct overload might not be present.This PR solves this by extending the auto-loading to function overloads. This is accomplished by adding an
EXTENSION_FUNCTION_OVERLOADS- which includes specific overloads for functions. On start-up, the extension overloads are registered using a dummy function. In the bind of this dummy function the extension auto-load is attempted - and if successful, the function is loaded from the extension. This also fixes a long-standing issue around auto-loading of ICU that was mainly present in WASM, as ICU introduces a lot of overloads for existing functions which would then result in a binding error when used without ICU, instead of resulting in ICU itself being loaded.The
EXTENSION_FUNCTION_OVERLOADSare automatically generated similar to the functions. If the same function exists in multiple extensions - it is added toEXTENSION_FUNCTION_OVERLOADSinstead of toEXTENSION_FUNCTIONSwith the corresponding overloads.