This repository was archived by the owner on Sep 3, 2022. It is now read-only.

@ojarjur (Contributor) commented Aug 23, 2018

Apparently, Container-Optimized OS stores the users database in a
temporary filesystem, so the database is lost and recreated every time
an instance is restarted.

Among the many important things recorded in that database is the
mapping from user names (e.g. `datalab`) to numeric user
IDs (e.g. `2000`). Because the users database is recreated on every
restart, that mapping can change seemingly at random.

For instance, the `datalab` user can have an ID of `2000` on one boot,
with the `logger` user having an ID of `2001`, and after rebooting the
instance those numbers could be reversed: `datalab` having a user ID
of `2001` and `logger` having a user ID of `2000`.

Since file ownership is defined in terms of user ID, this means that
the owner of files under each home directory can change randomly every
time an instance is rebooted.
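
The effect described above can be illustrated with a small simulation. This is only a sketch: the dictionaries stand in for the filesystem metadata and the users database, and the file paths are hypothetical examples, not taken from the actual instances.

```python
# Simulated file metadata: a file records only the numeric ID of its owner.
# The paths below are hypothetical and used purely for illustration.
files = {
    "/home/datalab/.ssh/authorized_keys": 2000,
    "/home/logger/log.txt": 2001,
}


def owner_name(path, users_db):
    """Resolve a file's owner name through the (name -> UID) users database."""
    uid_to_name = {uid: name for name, uid in users_db.items()}
    return uid_to_name[files[path]]


# Users database as created on one boot.
boot_1 = {"datalab": 2000, "logger": 2001}
# Users database recreated on a later boot, with the two IDs swapped.
boot_2 = {"datalab": 2001, "logger": 2000}

key_file = "/home/datalab/.ssh/authorized_keys"
print(owner_name(key_file, boot_1))  # datalab
print(owner_name(key_file, boot_2))  # logger
```

The file itself never changes; only the name that its stored UID resolves to does.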

That, in turn, causes `datalab connect` calls to fail, as the SSH
tunnel cannot be created if the `datalab` user cannot log in.

This change fixes that problem by making the file ownership of the
`/home/datalab` and `/home/logger` directories stable. That is done by
attempting to assign those two users consistent UIDs, and then forcing
the file ownership to match the corresponding users even if the UID
has changed.
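
As a rough sketch of the "force the ownership to match" step, the following Python uses the standard `pwd` and `os` modules. The helper names `lookup_ids` and `force_ownership` are made up for this illustration, and this is not the code in the change itself, which works through the instance startup script.

```python
import os
import pwd


def lookup_ids(username):
    """Return the (uid, gid) currently assigned to a user name."""
    entry = pwd.getpwnam(username)
    return entry.pw_uid, entry.pw_gid


def force_ownership(root, uid, gid):
    """Recursively set the owner of `root` and everything below it."""
    os.lchown(root, uid, gid)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            # lchown changes a symlink itself rather than its target.
            os.lchown(os.path.join(dirpath, name), uid, gid)
```

Run as root, `force_ownership("/home/datalab", *lookup_ids("datalab"))` would re-own the home directory to whatever UID the `datalab` user has on the current boot.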

Changing the startup script in the `create.py` file is sufficient to
do this for both regular and GPU-enabled instances, as GPU instances
no longer have their own startup-script extensions. This change
also removes the structure that was previously used for startup-script
extensions, to make it clear that they are no longer used.
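
For illustration only, a single shared startup-script fragment could be rendered from Python along these lines. The function name `stable_user_commands`, the exact shell commands, and the UID values are assumptions made for this sketch and are not copied from `create.py`.

```python
def stable_user_commands(users):
    """Render shell lines that pin each user's UID and re-own its home directory."""
    lines = []
    for name, uid in users:
        # Try to create the user with a fixed UID; tolerate failure, for
        # example when the user already exists under a different UID.
        lines.append("useradd -u {uid} {name} || true".format(uid=uid, name=name))
        # Whatever UID the user ended up with, make the home directory match it.
        lines.append("chown -R {name} /home/{name}".format(name=name))
    return "\n".join(lines)


# Hypothetical UIDs, matching the examples used in the description.
print(stable_user_commands([("datalab", 2000), ("logger", 2001)]))
```

Because the same fragment applies to every instance type, no per-type extension hook is needed.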

This fixes #2014

@qimingj (Contributor) left a comment

Thanks for the detailed write-up!

Linked issue: File permissions for the `datalab` user getting corrupted, which blocks the `datalab connect` command from working.