add new localstack filesystem hierarchy #6302

Merged
thrau merged 21 commits into v1 from directories-rework
Jun 28, 2022

@thrau thrau commented Jun 19, 2022

This PR finalizes the changes to the filesystem hierarchy we started in #5011 and introduces a semantically well-defined filesystem hierarchy for the localstack system. It cleanly separates directories created on the host (for CLI users or docker users mounting directories into the container) from the filesystem hierarchy inside localstack. For host mode, we now use ./.filesystem as the filesystem root. The changes can be disabled by setting LEGACY_DIRECTORIES=1.

LocalStack filesystem hierarchy

The hierarchy inside the container will be as follows:

  • /var/lib/localstack: the main localstack volume that stores data across localstack runs; this is what users are interested in

    • /var/lib/localstack/lib: variable packages (like lazy-loaded third-party dependencies, also referred to as var_libs)
    • /var/lib/localstack/logs: logs of the recent localstack run
    • /var/lib/localstack/state: persistent state of third-party services (or what we sometimes call "data" or "assets", e.g., opensearch cluster data)
    • /var/lib/localstack/tmp: temporary data that does not survive localstack runs (is cleared when localstack starts)
    • /var/lib/localstack/cache: temporary data that survives localstack runs (is not cleared when localstack starts)
  • /usr/lib/localstack: static third-party packages installed into the container images. this was previously "localstack/infra", and is also referred to as static_libs

  • /etc/localstack: 🚧 localstack configuration dir (for future use, currently not in use)

    • /etc/localstack/conf.d: localstack configuration overwrites
    • /etc/localstack/init/: localstack initialization hooks
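
As a rough illustration of the hierarchy above, the container paths could be modelled as a small immutable value object. This is only a sketch: the class name HierarchySketch and its attribute names are hypothetical and chosen for illustration; they are not taken from the PR's code.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class HierarchySketch:
    """Hypothetical model of the container layout; all names are illustrative."""

    volume: PurePosixPath = PurePosixPath("/var/lib/localstack")
    static_libs: PurePosixPath = PurePosixPath("/usr/lib/localstack")
    etc: PurePosixPath = PurePosixPath("/etc/localstack")

    @property
    def var_libs(self) -> PurePosixPath:
        return self.volume / "lib"  # lazy-loaded third-party packages

    @property
    def logs(self) -> PurePosixPath:
        return self.volume / "logs"  # logs of the most recent run

    @property
    def state(self) -> PurePosixPath:
        return self.volume / "state"  # persistent service state

    @property
    def tmp(self) -> PurePosixPath:
        return self.volume / "tmp"  # cleared when localstack starts

    @property
    def cache(self) -> PurePosixPath:
        return self.volume / "cache"  # survives restarts

    @property
    def conf_d(self) -> PurePosixPath:
        return self.etc / "conf.d"  # configuration overwrites

    @property
    def init(self) -> PurePosixPath:
        return self.etc / "init"  # initialization hooks
```

Deriving every subdirectory from three roots keeps the layout in one place, so moving a root moves everything beneath it.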

User experience

From a user perspective, it should be similar to the mariadb and other database containers, where you mount a local directory into /var/lib/localstack, which persists all the data required across localstack runs.
Currently this is achieved by this line:

volumes:
- "${TMPDIR:-/tmp}/localstack:/tmp/localstack"

Users only need to change that to /my/local/localstack/volume:/var/lib/localstack, where the host path could still be /tmp/localstack.
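
The mount change described above could be scripted roughly as follows. This is a sketch, not tooling shipped with the PR: the function name migrate_volume_spec is hypothetical, and it only handles the plain host:container form with an optional :ro or :rw suffix.

```python
LEGACY_TARGET = "/tmp/localstack"
NEW_TARGET = "/var/lib/localstack"


def migrate_volume_spec(spec: str) -> str:
    """Rewrite a bind-mount spec that targets the legacy container path.

    Matching is done on the tail of the string, so host paths containing
    colons (for example the shell default "${TMPDIR:-/tmp}") stay intact.
    """
    for suffix in ("", ":ro", ":rw"):
        legacy_tail = f":{LEGACY_TARGET}{suffix}"
        if spec.endswith(legacy_tail):
            host = spec[: -len(legacy_tail)]
            return f"{host}:{NEW_TARGET}{suffix}"
    # Anything that does not mount into the legacy target is left untouched.
    return spec


print(migrate_volume_spec("${TMPDIR:-/tmp}/localstack:/tmp/localstack"))
```

Mounts that point elsewhere, such as the docker socket, pass through unchanged.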

DATA_DIR and HOST_TMP_FOLDER are no longer used:

  • DATA_DIR points implicitly to /my/local/localstack/volume/state. if you want to use persistence, set PERSISTENCE=1. if you set DATA_DIR, the deprecation path is that we log a warning that DATA_DIR is ignored and set PERSISTENCE to 1
  • HOST_TMP_FOLDER is determined by inspecting the volume mounts and using the source of the bind mount to /var/lib/localstack.
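
The two rules above can be sketched as follows. Both function names are hypothetical, and the mount records are assumed to have the shape of the "Mounts" entries reported by docker inspect (keys Type, Source, Destination); the PR's actual lookup may differ.

```python
import logging
from typing import Mapping, Optional, Sequence

LOG = logging.getLogger(__name__)
VOLUME_TARGET = "/var/lib/localstack"


def persistence_enabled(env: Mapping[str, str]) -> bool:
    """Deprecation path: a set DATA_DIR is ignored but switches persistence on."""
    if env.get("DATA_DIR"):
        LOG.warning("DATA_DIR is deprecated and ignored, enabling PERSISTENCE=1 instead")
        return True
    return env.get("PERSISTENCE", "").strip().lower() in ("1", "true")


def find_volume_source(mounts: Sequence[Mapping[str, str]]) -> Optional[str]:
    """Return the host directory bind-mounted to the localstack volume, if any."""
    for mount in mounts:
        if mount.get("Type") == "bind" and mount.get("Destination") == VOLUME_TARGET:
            return mount.get("Source")
    return None
```

Returning None when no matching bind mount exists leaves it to the caller to decide on a fallback.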

what they will see is something like this:

thomas@ninox /tmp/localstack % tree -L 3
.
├── cache
│   ├── machine.json
│   └── localstack.opensearch-1.1.0-linux-x64.tar.gz
├── lib
│   └── opensearch
│       └── 1.1.0
├── logs
│   ├── localstack_infra.err
│   └── localstack_infra.log
├── state
└── tmp
    └── zipfile.4986fb95 # <- lambda code

CLI users

Previously, each path was individually configurable (like DATA_DIR, VAR_LIBS, STATIC_LIBS, etc.). this is no longer supported or needed. the only variable that needs to be set is LOCALSTACK_VOLUME_DIR, which essentially turns into the docker flag -v ${LOCALSTACK_VOLUME_DIR}:/var/lib/localstack. by default, LOCALSTACK_VOLUME_DIR=/tmp/localstack, but that could change to ~/.cache/localstack/volume (~/.cache/localstack is also associated with the CLI)
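
A minimal sketch of how LOCALSTACK_VOLUME_DIR could be turned into the docker flag mentioned above. The function name volume_flags and the constant names are hypothetical; the default value is the one stated in this description.

```python
import os
from typing import List, Mapping

DEFAULT_HOST_VOLUME_DIR = "/tmp/localstack"  # default named in the description
CONTAINER_VOLUME = "/var/lib/localstack"


def volume_flags(env: Mapping[str, str] = os.environ) -> List[str]:
    """Build the docker CLI arguments for the main localstack volume."""
    host_dir = env.get("LOCALSTACK_VOLUME_DIR") or DEFAULT_HOST_VOLUME_DIR
    return ["-v", f"{host_dir}:{CONTAINER_VOLUME}"]


print(volume_flags({"LOCALSTACK_VOLUME_DIR": "/home/me/.cache/localstack/volume"}))
```

Passing the environment as a parameter keeps the function easy to test without touching the real process environment.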

Developer experience (host mode)

when running in host mode, we don't want to use /usr/lib/localstack or /var/lib/localstack, etc on the host, since those are associated with the system and root permissions. instead, we simply move the filesystem hierarchy root to the environment variable FILESYSTEM_ROOT, which is by default ./.filesystem (from the localstack project root).
the hierarchy will look something like this (from the localstack project root):

thomas@ninox ~/workspace/localstack/localstack
 % tree .filesystem -L 4
.filesystem
├── etc
│   └── localstack
│       ├── conf.d
│       └── init
├── usr
│   └── lib
│       └── localstack
│           ├── amazon-kinesis-client
│           ├── aws-lambda-rie
│           ├── dynamodb
│           ├── kinesis-mock
│           ├── kms
│           ├── lambda.cfn-response.js
│           ├── localstack-utils-fat.jar
│           └── stepfunctions
└── var
    └── lib
        └── localstack
            ├── cache
            ├── state
            ├── lib
            ├── logs
            └── tmp
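
The re-rooting shown in the tree above amounts to prefixing every container path with FILESYSTEM_ROOT. A sketch of that mapping, with a hypothetical helper name host_mode_path:

```python
import os
from pathlib import Path
from typing import Mapping


def host_mode_path(container_path: str, env: Mapping[str, str] = os.environ) -> Path:
    """Map an absolute container path to its location under FILESYSTEM_ROOT."""
    root = Path(env.get("FILESYSTEM_ROOT") or "./.filesystem")
    # Strip the leading slash so the path is joined below the root
    # instead of replacing it.
    return root / container_path.lstrip("/")


print(host_mode_path("/var/lib/localstack/cache", {}))
```

Stripping the leading slash matters because joining an absolute path onto a pathlib Path would otherwise discard the root.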

Developer experience (building)

When building the docker image, we need to do some installation prep of third-party libraries on the host before invoking docker build.
We piggy-back on the host-mode .filesystem directory, and simply ADD .filesystem/usr/lib/localstack /usr/lib/localstack.
This is not great IMO because it mixes concerns (host mode runtime with build time dependency management), but this is also the way it works right now.

PR Changes

Rundown of major changes:

  • the main volume target is no longer /tmp/localstack but /var/lib/localstack, so /tmp/localstack:/tmp/localstack volume mounts need to change to /tmp/localstack:/var/lib/localstack
  • config.dirs.tmp is now cleared on every startup and shutdown
  • static libs are now /usr/lib/localstack (./.filesystem/usr/lib/localstack in host mode or when running make init)
  • var libs are now /var/lib/localstack/lib
  • the config.dirs.function dir is no longer needed. it was conceived at the time as a directory in the container where lambda code goes (instead of tmp), but was later used as config variable to hold the HOST_TMP_PATH, which is not needed (since we are inspecting the container now to find the value)
  • introduce config variable LEGACY_DIRECTORIES that can be activated as a feature flag to enable the old mode
  • using the CLI to start localstack no longer creates the directories on the host. directories are only created when actually running localstack, or when we're in the container.
  • changed download destination to config.dirs.cache so downloaded archives are kept in case installation was incomplete
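
The different lifetimes of the tmp and cache directories could be implemented along these lines. This is a sketch with a hypothetical function name, not the code from this PR: tmp is wiped and recreated, while cache is simply never touched.

```python
import shutil
from pathlib import Path


def reset_tmp_dir(tmp_dir: Path) -> None:
    """Remove everything under tmp_dir, then recreate it as an empty directory."""
    shutil.rmtree(tmp_dir, ignore_errors=True)
    tmp_dir.mkdir(parents=True, exist_ok=True)
```

Calling this on startup and again on shutdown would match the behavior described for config.dirs.tmp, while downloaded archives in the cache directory remain available if an installation was interrupted.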

Rundown of other changes

  • remove explicit elasticmq download from Dockerfile into static_libs, and move it into var_libs (as part of phase-out)
  • create soft link from /opt/code/localstack/localstack/infra -> /usr/lib/localstack for backwards compatibility
  • move supervisord logs from /tmp/localstack_infra.log to /var/lib/localstack/logs, but create symlinks to not break people's code
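
The backwards-compatibility links mentioned above could be created roughly like this. The helper name link_legacy_path is hypothetical, and the sketch deliberately refuses to overwrite anything that already exists at the legacy location.

```python
import os
from pathlib import Path


def link_legacy_path(legacy: Path, target: Path) -> None:
    """Make `legacy` a symlink to `target` unless something already exists there."""
    # is_symlink() also catches dangling links, which exists() reports as missing.
    if legacy.is_symlink() or legacy.exists():
        return
    legacy.parent.mkdir(parents=True, exist_ok=True)
    os.symlink(target, legacy)
```

For example, link_legacy_path(Path("/opt/code/localstack/localstack/infra"), Path("/usr/lib/localstack")) would mirror the soft link listed above; calling it a second time is a no-op.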

Limitations and future work

  • The separation of CLI and runtime configuration is still not clear enough for me. No directory configuration should be shared between the CLI and the runtime IMO.
  • There are lots of instances of the form (conceptually) data_store = config.dirs.data or config.dirs.tmp; these should slowly be migrated to simply use config.dirs.data (which will point to a temporary directory if PERSISTENCE=0), or use the config.PERSISTENCE flag if they want to set data_store (or whatever they are using) to None.
  • The package directories var_libs and static_libs still have no structure to them; I think we should introduce some well-defined package index (like site-packages, node_modules, or maven's .m2)
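
The data_store migration described in the list above might look roughly like this. Everything here is a hypothetical sketch: in particular, the exact location that config.dirs.data points to when PERSISTENCE=0 is an assumption (a subdirectory of the ephemeral tmp dir), since the description only says it will be a temporary directory.

```python
from typing import Optional

VOLUME = "/var/lib/localstack"


def data_dir_for(persistence: bool, volume: str = VOLUME) -> str:
    """Always return a usable data directory, so callers need no fallback logic."""
    # Assumption: without persistence, state lives under the ephemeral tmp dir.
    return f"{volume}/state" if persistence else f"{volume}/tmp/state"


def optional_data_store(persistence: bool, data_dir: str) -> Optional[str]:
    """For callers that want None when persistence is off, branch on the flag."""
    return data_dir if persistence else None
```

With this split, the "data or tmp" fallback disappears from call sites: they either take the directory unconditionally or check the persistence flag explicitly.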

github-actions bot commented Jun 19, 2022

LocalStack integration with Pro

0 tests   - 1 047   0 ✔️  - 972   0s ⏱️ - 1h 20m 11s
0 suites  -        1   0 💤  -   50 
0 files    -        1   0  -   24 

Results for commit e35e61a. ± Comparison against base commit d379176.

♻️ This comment has been updated with latest results.

@thrau thrau force-pushed the directories-rework branch from 470db1c to f19873e Compare June 20, 2022 14:05
@thrau thrau force-pushed the directories-rework branch from 496da12 to 0700176 Compare June 20, 2022 17:44
@thrau thrau force-pushed the directories-rework branch from 0700176 to 3e91e7b Compare June 20, 2022 22:12
@alexrashed alexrashed left a comment

While there are still a few failing tests and some small open questions, the changes look great. It's nice to see that we are finally migrating to a properly defined directory structure! 🥳

Comment on lines +802 to +803
config.dirs.mkdirs()

Member

Same as above: Is this necessary? It seems to be done implicitly when loading the config.py for most cases.

@@ -59,6 +60,7 @@ def install(package, parallel):
"""
console.print(f"resolving packages: {package}")
installers: Dict[str, Callable] = InstallerManager().get_installers()
config.dirs.mkdirs()
Member

Is this necessary? It seems to be done implicitly when loading the config.py for most cases.

Member Author

same as above, but with install being a special case where we want the directory creation outside the runtime (although that's not clean behavior)

functions=f"{DEFAULT_VOLUME_DIR}/tmp", # FIXME: remove - this was misconceived
data=f"{DEFAULT_VOLUME_DIR}/data",
logs=f"{DEFAULT_VOLUME_DIR}/logs",
config="/etc/localstack/conf.d", # for future use
Member

@alexrashed alexrashed Jun 21, 2022

What exactly will the future use of the config directory be? Currently, we use it to define config profiles which are only used by the CLI. With this change, the config directory in the container will be empty (since nothing will mount to /etc/localstack for now). I also think it might be a bit confusing to have an unused config dir and the CONFIG_DIR env (where both of them would have different semantics).

Member Author

fair point. i think the confusion in general comes from the conflation of a) CLI config with b) runtime config in container mode, and c) runtime config in host mode. part of what this PR addresses is the clearer separation of those. CONFIG_DIR is purely a CLI option that holds configuration for the CLI (which in turn again can manipulate localstack config). however, there's a case for using custom configurations without the CLI.

@thrau thrau force-pushed the directories-rework branch from 3e91e7b to 602c478 Compare June 21, 2022 21:20
@thrau thrau force-pushed the directories-rework branch from c035895 to 3c1bcd8 Compare June 22, 2022 11:49
@thrau thrau force-pushed the directories-rework branch from 3c1bcd8 to 7b44a96 Compare June 22, 2022 17:40
@thrau thrau changed the title from "introduce /var/lib/localstack as main container volume" to "add new localstack filesystem hierarchy" Jun 22, 2022
@thrau thrau marked this pull request as ready for review June 22, 2022 18:11
@thrau thrau requested review from giograno and whummer June 22, 2022 18:11
thrau commented Jun 22, 2022

looks like there's still a quirk in the pro tests. created an ext PR and test run here: https://github.com/localstack/localstack/actions/runs/2544647881

thrau commented Jun 22, 2022

looks good now: https://github.com/localstack/localstack/runs/7012412322

@giograno giograno left a comment

👍 especially for making config.dirs.tmp ephemeral! It saves quite a few ifs in the pods codebase.

@thrau thrau merged commit eb7d83a into v1 Jun 28, 2022
@thrau thrau deleted the directories-rework branch June 28, 2022 16:38
@github-actions github-actions bot locked and limited conversation to collaborators Jun 28, 2022
5 participants