Don't remap host filesystem paths when mounting them in CLP's execution containers

### Request

Input and output directories that the CLP package operates on are mounted into the container as follows:

* The directory set via the `logs_input.directory` config gets mounted to `/mnt/logs/<logs_input.directory>`
* The directory set via the `archive_output.storage.directory` config gets mounted to `/mnt/archive-output`
* The directory set via the `stream_output.storage.directory` config gets mounted to `/mnt/stream-output`
* The directory set via the `logs_directory` config gets mounted to `/opt/clp/var/log`
* The directory set via the `data_directory` config gets mounted to `/opt/clp/var/data`

Although this provides some level of obfuscation of host paths, it presents some challenges for users:

* Paths printed to the console are relative to the container mount rather than the host filesystem (see #602).
* External users (e.g., Presto) of the datasets table (see #868) will see a container-relative path (e.g., `/mnt/archive-output/dataset1`) and will then need to translate it to a path on the host filesystem.
  * This problem is exacerbated if the external user is also running in its own container.
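To illustrate the translation burden, here is a rough sketch of what an external consumer would have to do today. The container paths come from the mount list above; the host directories and the `container_to_host_path` helper are hypothetical and only stand in for whatever the user actually configured.

```python
from pathlib import PurePosixPath

# Container mount point -> host directory.
# The host directories are made-up examples of configured values.
MOUNTS = {
    PurePosixPath("/mnt/archive-output"): PurePosixPath("/data/clp/archives"),
    PurePosixPath("/mnt/stream-output"): PurePosixPath("/data/clp/streams"),
}


def container_to_host_path(container_path: str) -> PurePosixPath:
    """Translate a container-relative path back to a host path."""
    path = PurePosixPath(container_path)
    for container_root, host_root in MOUNTS.items():
        try:
            # relative_to raises ValueError if path is not under container_root.
            return host_root / path.relative_to(container_root)
        except ValueError:
            continue
    raise ValueError(f"{container_path} is not under a known mount point")


print(container_to_host_path("/mnt/archive-output/dataset1"))
```

Every consumer needs its own copy of this mapping, and it has to stay in sync with the package's config, which is the overhead this request aims to remove.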

So the proposal is to simply mount the host filesystem paths specified by these configs directly into the container, without remapping them to a different path. Obviously, we need to ensure that the paths specified in the config file *can* be mapped into the container without causing issues, but that should be doable since we control what container the package runs in.

### Possible implementation

* Validate that the paths specified in the config file don't overlap with paths that exist in the container (e.g., ensure they don't overlap with `/opt`).
  * The one exception to this rule would be `logs_input.directory`, since a user is quite likely to configure `/var/log` or `/tmp` as their input directory, and we may not be able to mount those paths directly into the container. For this case, I think we can continue to mount it into the container as we do currently and document the behaviour. Since we retain the full input path and just prefix it, the user shouldn't get too confused. In the future, we could potentially remove the `/mnt/logs` prefix when we print out the path.
* Mount the configured paths directly into the container without remapping them.
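
The two steps above could look roughly like the following. This is only a sketch under stated assumptions: `RESERVED_CONTAINER_PATHS`, `overlaps`, and `plan_mount` are hypothetical names rather than existing CLP code, and the reserved list contains only `/opt` because that is the only path named in this issue.

```python
from pathlib import PurePosixPath

# Assumption: paths that already exist in the container image and must not
# be shadowed by a host mount. A real list would likely be longer.
RESERVED_CONTAINER_PATHS = [PurePosixPath("/opt")]

# Prefix currently used for the input logs directory inside the container.
LOGS_INPUT_PREFIX = PurePosixPath("/mnt/logs")


def overlaps(a: PurePosixPath, b: PurePosixPath) -> bool:
    """Return True if the paths are equal or one is nested inside the other."""
    return a == b or a in b.parents or b in a.parents


def plan_mount(config_key: str, host_dir: str) -> tuple[str, str]:
    """Return (host_path, container_path) for a configured directory."""
    host = PurePosixPath(host_dir)
    if not host.is_absolute():
        raise ValueError(f"{config_key}: {host_dir} must be an absolute path")

    if config_key == "logs_input.directory":
        # Exception: keep prefixing the input directory, as is done currently.
        return str(host), str(LOGS_INPUT_PREFIX / host.relative_to("/"))

    for reserved in RESERVED_CONTAINER_PATHS:
        if overlaps(host, reserved):
            raise ValueError(f"{config_key}: {host_dir} overlaps with {reserved}")

    # No remapping: the container sees the same path as the host.
    return str(host), str(host)


print(plan_mount("archive_output.storage.directory", "/data/clp/archives"))
print(plan_mount("logs_input.directory", "/var/log"))
```

The returned pair could then be turned into a bind mount for the container runtime, for example a Docker `--mount type=bind,src=<host_path>,dst=<container_path>` argument.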
