The purpose of this repository is to make it easy to run a development box on GKE.

Reasons for moving development into a container:

- Needing more resources (CPU/RAM/GPU) than your local machine
- Needing a different operating system/architecture in order to compile code
  - e.g. TensorFlow Federated doesn't work on M1 (tensorflow/federated#1254)
The solution consists of the following pieces (a sketch of the pod layout follows the list):

- A StatefulSet for running the container
  - We use a StatefulSet because it gives the pod a stable name
- A PVC for storing the home directory and other files
  - This ensures data isn't lost between pod restarts
  - It also means we can tear down the StatefulSet to save compute costs and restart it later
- Tailscale running in a sidecar to add the pod to your mesh so it is connectable from outside the cluster
  - This makes it easy to connect to the pod, including to Jupyter
- An SSH server running inside the main container
  - This can be used with VSCode over SSH to run VSCode on your local machine but edit/run code inside the container
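A minimal sketch of how these pieces might fit together in a StatefulSet spec; the names, image, and storage size below are illustrative and not the exact manifests in this repo:

```yaml
# Illustrative sketch only; see the manifests in this repo for the real spec.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: devbox
spec:
  serviceName: devbox
  replicas: 1
  selector:
    matchLabels:
      app: devbox
  template:
    metadata:
      labels:
        app: devbox
    spec:
      containers:
        - name: devbox                              # main container: Jupyter + sshd on port 2222
          image: gcr.io/my-project/devbox:latest    # hypothetical image name
        - name: tailscale                           # sidecar that joins the pod to your tailnet
          image: tailscale/tailscale:latest
  volumeClaimTemplates:
    - metadata:
        name: storage                               # durable home directory (mounted at /storage)
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 100Gi
```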
An SSH key is needed for two purposes:

- Connecting to GitHub from the container to push/pull code
  - The private key needs to be stored in the container
- Connecting to the container via SSH from your local machine (e.g. for VSCode)
  - The private key needs to be stored on your local machine
We can use the same key for both. We use a K8s secret to make the key available to the pod. We also use a secret to make the SSH authorized keys available to the pod (see below).
Generate an SSH key:

```bash
ssh-keygen -t ed25519 -C "[email protected]"
```

Save the key to `${HOME}/.ssh/devbox`.
Don't set a passphrase.
Add the public key to your GitHub SSH keys.

Create a K8s secret containing the key pair:

```bash
kubectl create secret generic ${USER}-ssh \
  --from-file=id_ed25519=${HOME}/.ssh/devbox \
  --from-file=id_ed25519.pub=${HOME}/.ssh/devbox.pub
```
We mount the SSH keys into `/secrets` rather than `${HOME}/.ssh`. We do this because, due to kubernetes/kubernetes#81089, it's not clear whether the directory `${HOME}/.ssh`

- will end up being owned by the user the container is running as
- will be writable (e.g. for known_hosts)

If we instead let `${HOME}/.ssh` live on the persistent volume, we can easily do manual setup and have it persist across reboots.
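For illustration, mounting that secret at `/secrets` might look roughly like this in the pod spec; the secret name corresponds to the `${USER}-ssh` secret created above, and the rest is a sketch:

```yaml
# Sketch of the relevant volume/volumeMount entries in the pod spec.
volumes:
  - name: ssh-key
    secret:
      secretName: someuser-ssh     # the ${USER}-ssh secret created above
      defaultMode: 0400            # private keys must not be group/world readable
containers:
  - name: devbox
    volumeMounts:
      - name: ssh-key
        mountPath: /secrets
        readOnly: true
```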
The startup script `startup.sh` starts `ssh-agent` and adds the key in `/secrets`.
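For illustration, the ssh-agent portion of such a startup script could look like this sketch (the actual `startup.sh` in this repo may differ):

```bash
#!/bin/bash
# Start an ssh-agent for this session and load the private key that the
# K8s secret mounts at /secrets.
eval "$(ssh-agent -s)"
ssh-add /secrets/id_ed25519
```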
Create a secret containing the SSH keys authorized to SSH into the container. These will be the public key(s) of the SSH keys on the machines you will be SSH'ing from:

```bash
kubectl create secret generic ${USER}-auth-keys \
  --from-file=authorized_keys=${HOME}/.ssh/id_ed25519.pub
```
We start an SSH server for use with VSCode. The SSH server runs on port 2222 because it runs as user `jupyter` and therefore can't bind port 22, which is privileged. For more info see Run SSHD as non root user.

To use VSCode, follow the instructions for Remote development using ssh.
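As a rough illustration of the sshd settings involved in running as a non-root user; the paths and key locations below are assumptions, not necessarily what this image uses:

```
# Illustrative sshd_config fragment for running sshd as a non-root user.
Port 2222                                   # unprivileged port, so no root needed
HostKey /storage/jupyter/.sshd/host_key     # host key in a user-writable location (assumed path)
PidFile /tmp/sshd.pid                       # PID file the user can write
AuthorizedKeysFile /etc/authorized_keys/authorized_keys   # from the ${USER}-auth-keys secret (assumed mount path)
PasswordAuthentication no
```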
You will need to edit your host settings in `~/.ssh/config` to set the username and port like this:

```
Host 100.92.148.119
  HostName 100.92.148.119
  User jupyter
  Port 2222
```
The hostname should be the IP address assigned by Tailscale.
To store the home directory and other files on durable storage we do the following (see the sketch after this list):

- Mount a PVC at `/storage`
- Set `HOME` to `/storage/jupyter`
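A minimal sketch of the corresponding container settings in the StatefulSet; the volume name is illustrative:

```yaml
# Sketch: home directory lives on the PVC rather than on ephemeral storage.
containers:
  - name: devbox
    env:
      - name: HOME
        value: /storage/jupyter
    volumeMounts:
      - name: storage            # the PVC from the volumeClaimTemplate
        mountPath: /storage
```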
We hit a couple of issues that led to this approach, as opposed to:

- Mounting the PVC at `/home/jupyter`
- Mounting the PVC at `/home`
When we mounted the PVC at `/home/jupyter`, the user/group permissions of the drive caused SSH to complain when using the SSH keys in `/home/jupyter/.ssh` to allow SSH'ing into the pod. SSH expects the home directory to be readable only by the user. However, it looks like the directory at which the PVC is mounted is owned by root.
Mounting the PVC one level higher at `/home` fixed this. However, I observed that some other ephemeral volume was being mounted at `/home/jupyter`, so the home directory wasn't actually on the PVC. This was evident from running `mount`:
```
/dev/sdb on /home type ext4 (rw,relatime)
/dev/sda1 on /home/jupyter type ext4 (rw,nosuid,nodev,relatime,commit=30)
```
Inspecting the Docker image using crane indicates that the Dockerfile declares `VOLUME /home/jupyter`. I think this causes an ephemeral volume to be mounted there by the kubelet if no volume is explicitly mounted at that path.
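One way to check the volumes declared by an image is with crane; the image name here is a placeholder:

```bash
# Print the image config and pull out any VOLUME declarations.
crane config gcr.io/my-project/devbox:latest | jq '.config.Volumes'
```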
I originally tried Kaniko but ran into issue GoogleContainerTools/skaffold#7701 with not being able to increase ephemeralStorage, so I switched to GCB (Google Cloud Build).

With GCB I had to use a 32-CPU machine; it was timing out trying to push the image with 8 CPUs.
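If you build via skaffold, the Cloud Build machine type is configured in the `googleCloudBuild` section of `skaffold.yaml`; this fragment is a sketch and the project id is illustrative:

```yaml
# Fragment of skaffold.yaml: build with Google Cloud Build on a larger machine.
build:
  googleCloudBuild:
    projectId: my-project        # illustrative project id
    machineType: N1_HIGHCPU_32   # 8 CPUs timed out pushing the image
```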
If you can't connect to the devbox, try to confirm whether the problem is with Tailscale or with SSH.

- Check the SSH logs in `/tmp/sshd.log`
- Try SSH'ing to the pod from within the pod itself (see the sketch after this list)
  - i.e. use `kubectl exec` to start a shell in the pod and then run `ssh` in the pod
  - If this succeeds then it is most likely a networking issue with Tailscale
- Try starting an HTTP server on the pod and seeing if you can connect to it
  - `python -m http.server 8000`
- If it appears to be an issue with Tailscale, log in to Tailscale and try removing the device
- Check the Tailscale logs; there will most likely be a link to authenticate to Tailscale
  - Use the link to reauthenticate
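A sketch of the in-pod SSH check mentioned above; the pod name depends on your StatefulSet and is illustrative:

```bash
# Open a shell in the devbox pod (pod name is illustrative).
kubectl exec -it devbox-0 -- /bin/bash

# From inside the pod, try connecting to the local sshd on port 2222.
ssh -p 2222 jupyter@localhost
```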
You can start an HTTP server on the dev box by running:

```bash
python3 -m http.server 8000
```

This is useful if you're trying to debug SSH issues and need to figure out whether it's an SSH issue or a network/Tailscale issue.
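From your local machine you can then test connectivity to that server over Tailscale, for example using the Tailscale IP from the SSH config example above:

```bash
# If this succeeds but SSH doesn't, the problem is likely SSH rather than Tailscale.
curl http://100.92.148.119:8000
```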
When using skaffold with GCB, `skaffold build` exits with the error:

```
error copying logs to stdout: invalid write result
```

Running `skaffold build` with verbose logging, e.g. `skaffold build -v`, appears to fix this.
SSH'ing into the node hangs

- Make sure you include the `jupyter` user and port, e.g. `ssh jupyter@HOSTNAME -p 2222`