Closed
Labels
enhancement: New feature or request
Description
Request
Input and output directories that the CLP package operates on are mounted into the container as follows:
- The directory set via the `logs_input.directory` config gets mounted to `/mnt/logs/<logs_input.directory>`
- The directory set via the `archive_output.storage.directory` config gets mounted to `/mnt/archive-output`
- The directory set via the `stream_output.storage.directory` config gets mounted to `/mnt/stream-output`
- The directory set via the `logs_directory` config gets mounted to `/opt/clp/var/log`
- The directory set via the `data_directory` config gets mounted to `/opt/clp/var/data`
Although this provides some level of obfuscation of host paths, it presents some challenges for users:
- Paths printed to the console are relative to the container mount rather than the host filesystem (see #602, "On compression failure, the path for logs is a container-internal path rather than a package-relative path.").
- External users (e.g., Presto) of the datasets table (see #868, "feat(clp-json): Use dataset-specific tables and archive directories for compression, decompression, and search.") will see a container-relative path (e.g., `/mnt/archive-output/dataset1`) and will then need to translate that to a path on the filesystem.
  - This problem is exacerbated if the external user is also running in its own container.
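To illustrate the translation burden described above, here is a minimal sketch of what an external user would have to do today to turn a container-relative path from the datasets table back into a host path. The mount table and the host directories (`/data/clp/archives`, `/data/clp/streams`) are made-up example values, and `container_to_host` is a hypothetical helper, not part of the CLP package.

```python
from pathlib import PurePosixPath

# Hypothetical mount table: container mount point -> host directory.
# The host paths are example values only, not CLP defaults.
MOUNTS = {
    PurePosixPath("/mnt/archive-output"): PurePosixPath("/data/clp/archives"),
    PurePosixPath("/mnt/stream-output"): PurePosixPath("/data/clp/streams"),
}


def container_to_host(container_path: str) -> PurePosixPath:
    """Translate a container-relative path back to the matching host path."""
    path = PurePosixPath(container_path)
    for mount_point, host_dir in MOUNTS.items():
        # The path must be the mount point itself or live underneath it.
        if path == mount_point or mount_point in path.parents:
            return host_dir / path.relative_to(mount_point)
    raise ValueError(f"{container_path} is not under a known mount point")


print(container_to_host("/mnt/archive-output/dataset1"))
# prints: /data/clp/archives/dataset1
```

Every consumer of the table would need its own copy of this mount table, and it would have to be kept in sync with the package config, which is the maintenance cost the proposal below tries to remove.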
So the proposal is to simply mount the host filesystem paths specified by these configs directly into the container, without remapping them to a different path. Obviously, we need to ensure that the paths specified in the config file can be mapped into the container without causing issues, but that should be doable since we control what container the package runs in.
Possible implementation
- Validate that the paths specified in the config file don't overlap with paths that exist in the container (e.g., ensure they don't overlap with `/opt`).
  - The one exception to this rule would be `logs_input.directory`, since it's very likely that a user might configure `/var/log` or `/tmp` as their input directory and we may not be able to mount this into the container. For this case, I think we can continue to mount it into the container as we do currently and document the behaviour: since we retain the full input path and just prefix it, the user shouldn't get too confused. In the future, we could potentially ensure that when we print out the path, we remove the `/mnt/logs-input` prefix.
- Mount the configured paths directly into the container without remapping them.
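The two steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: the config is modelled as a flat dict with dotted keys, `RESERVED_CONTAINER_DIRS` is a guessed list of directories the container image uses, `LOGS_INPUT_PREFIX` is a placeholder for whatever prefix the package applies to the input directory, and `plan_mounts` is a hypothetical helper rather than existing CLP code.

```python
from pathlib import PurePosixPath

# Assumed set of directories already used inside the container image.
RESERVED_CONTAINER_DIRS = [
    PurePosixPath(p) for p in ("/opt", "/usr", "/etc", "/bin", "/lib")
]
# Placeholder prefix for the input directory, which stays remapped.
LOGS_INPUT_PREFIX = PurePosixPath("/mnt/logs")


def overlaps(a: PurePosixPath, b: PurePosixPath) -> bool:
    """True if the paths are equal or one is an ancestor of the other."""
    return a == b or a in b.parents or b in a.parents


def plan_mounts(config: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Map each config key to a (host path, container path) pair."""
    mounts = {}
    for key, value in config.items():
        host = PurePosixPath(value)
        if not host.is_absolute():
            raise ValueError(f"{key}: {value} must be an absolute path")
        if key == "logs_input.directory":
            # Exception: keep prefixing the input directory.
            container = LOGS_INPUT_PREFIX / host.relative_to("/")
            mounts[key] = (str(host), str(container))
            continue
        for reserved in RESERVED_CONTAINER_DIRS:
            if overlaps(host, reserved):
                raise ValueError(f"{key}: {host} overlaps with {reserved}")
        # Everything else is mounted at the same path as on the host.
        mounts[key] = (str(host), str(host))
    return mounts


plan = plan_mounts({
    "logs_input.directory": "/var/log",
    "archive_output.storage.directory": "/data/clp/archives",
})
for key, (host, container) in plan.items():
    print(f"{key}: {host} -> {container}")
```

With a plan like this, the archive directory would appear at the same path inside and outside the container, so a path written to the datasets table would need no translation.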