This repository is a collection of tooling, docs and configuration that defines my homelab and special-purpose nodes. Currently this boils down to:
- `homeserver` cluster, serving general-purpose applications
- `homeserver-backup` cluster, keeping backups from the above
- `printserver`, which enables wireless access to a USB-only printer
Key takeaways:
- `ingress-nginx` on entry, with `cert-manager` + Let's Encrypt (DNS-based challenges in Route53) backed SSL, `oauth2-proxy` for non-OIDC-native services
- `vault` for centralized identity and secrets management
- `longhorn` used for storage, with daily backups of "important" volumes in a separate cluster
Clusters live in a separate cluster VLAN defined in a `network_layout` repository.
That repository also defines the VPN setup and IP assignments.
All traffic to `*.<DOMAIN>` and `*.backup.<DOMAIN>` is redirected to the corresponding cluster LBs at the router/VPN level.
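For illustration, with a dnsmasq-based resolver on the router, such a wildcard redirection could look like the sketch below. This is an assumption about the mechanism (the real setup lives in the `network_layout` repository), and the LB addresses are made-up examples:

```sh
# Hypothetical sketch: wildcard DNS overrides on a dnsmasq-based router.
# dnsmasq answers for a domain and all its subdomains; the longer match
# (backup.<DOMAIN>) wins for the backup cluster.
cat >> /etc/dnsmasq.conf <<'EOF'
address=/<DOMAIN>/10.0.10.2
address=/backup.<DOMAIN>/10.0.20.2
EOF
```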
Four Dell OptiPlex nodes, totaling 24 cores, 256G of RAM and 24T of storage (1x2T NVMe + 1x4T SATA SSD on each node).
Nodes are mounted in a 10″ rack using 3D-printed frames with minor modifications (TODO: upstream model changes).
Three RPis 4B, totaling 12 cores, 24G of RAM and 3T of storage (1x1T M.2 SATA attached over USB on each node).
Pis are mounted in a 10″ rack using 3D-printed frames. Power is provided via official PoE+ HATs.
RPi Zero 2 W, with an OTG splitter providing a USB-A port.
In other words, what needs to be done when you lay your hands on a new machine. As a rule of thumb this only has to be done once.
- Update the bootloader and make it boot from USB first: `RPi Imager` > `Bootloader` > `USB Boot`
- Flash the official Raspberry Pi OS (64-bit) image and make sure that it works fine. You can use this step to run `raspi-config` and set the WLAN country (a non-interactive variant is sketched below)
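If you prefer to script that last step, `raspi-config` can be driven non-interactively (the country code here is just an example):

```sh
# Non-interactive equivalent of Localisation Options > WLAN Country
sudo raspi-config nonint do_wifi_country PL
```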
- Run extended diagnostic suite
- Update BIOS
- Run extended diagnostic suite again
We want to use a bootstrap image that is ready to be provisioned with ansible without requiring any user interaction first.
To achieve this, we need to:
- create a bootstrap user `ansible_bootstrap` with passwordless `sudo` privileges
- provide a public SSH key to be added to `authorized_keys`
- set up the minimal required SSH hardening (deny password authentication, deny root login, only allow public-key-based logins); see the sketch below
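The hardening boils down to a handful of `sshd_config` directives. A minimal sketch, assuming the target's OpenSSH includes drop-in configs from `/etc/ssh/sshd_config.d/` (the file name is arbitrary):

```sh
# Minimal SSH hardening matching the three requirements above
cat > /etc/ssh/sshd_config.d/10-hardening.conf <<'EOF'
PasswordAuthentication no
PermitRootLogin no
PubkeyAuthentication yes
EOF
```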
Scripts are provided to prepare such an image.
Currently Raspbian is used for RPi nodes (because of the OOTB support for PoE+ HAT fans), while Dell nodes use Debian.
Take a look at the corresponding `build_*` scripts in the `image_build` directory for more details.
A few useful variables (see the example invocation below):
- `HOST_SSH_PUB_KEYS_FILE` points to a pubkey that should be added to `authorized_keys` on the target
- `LUKS_PASSWORD` (`build_debian` specific): if provided, it will be used for full disk encryption; defaults to obtaining the password from the password manager
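A hypothetical invocation combining both variables; the script name and the password-manager command are assumptions, so check the actual scripts in `image_build`:

```sh
# Sketch only: substitute your key path and password-manager lookup
HOST_SSH_PUB_KEYS_FILE=~/.ssh/id_ed25519.pub \
LUKS_PASSWORD="$(pass show homelab/luks)" \
./build_debian.sh
```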
Required packages on the host for the build to succeed:
- `vagrant` (builds are performed in VMs for better interoperability)
- `ssh`
Built images can be found in the `image_build/output` directory.
If you were to use an official image, you would have to perform the user, SSH and (optionally) LUKS setup manually.
In later steps Ansible will make sure that the SSH config is properly hardened and the `ansible_bootstrap` user is removed.
When you have the image on hand you can flash it to the drive using the tool of your choice, e.g. with `dd`:
# dd if=<path to the image> of=<path to your SSD> bs=64k oflag=dsync status=progress
or using a tool like Rufus or Etcher.
Create a bootable USB drive or upload the file to a TFTP server to perform a netboot. Afterwards, install the system as usual. Beware: you have to use the `Install` (not graphical) option for the preseed file to be taken into account.
The preseed file responsible for the initial setup is burned into the image itself.
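For reference, the user-creation portion of such a preseed file typically looks like the sketch below. This is illustrative only and not the repository's actual file; the password hash is elided:

```
# Debian installer preseed sketch: disable root, create the bootstrap user
d-i passwd/root-login boolean false
d-i passwd/make-user boolean true
d-i passwd/username string ansible_bootstrap
d-i passwd/user-password-crypted password <crypted password hash>
```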
This part is responsible for most of the software provisioning.
The idea is to ensure that core blocks are in place, for example:
- users
- firewall
- access restriction, e.g. via SSH
- required dependencies
- container runtime
This step also removes the `ansible_bootstrap` user and initializes the Kubernetes clusters.
To provision the nodes:
- Enter the `ansible` directory
- Set up the workspace with `poetry install`
- Get dependencies via `poetry run ansible-galaxy install -r requirements.yml`
- Run `poetry run ansible-playbook site.yml`
- (Dell specific) To provision secondary SATA SSD drives (if present), run `poetry run ansible-playbook site.yml -l <host> -t storage-setup`
The `k8s.yml` playbook relies on a proper kubeconfig, with `homeserver` and `homeserver-backup` contexts, being present on the host.
During the initial provisioning you may want to skip the `k8s` tag or just let it fail, set up the config, and rerun the playbook.
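For example, to run the first pass without the Kubernetes tasks:

```sh
# Skip the k8s-tagged tasks; rerun without --skip-tags once the
# kubeconfig contexts are in place
poetry run ansible-playbook site.yml --skip-tags k8s
```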
Take a look at `inventory.yml` and `site.yml` for supported options.
Most notably, the passwords set for newly created users and for secondary drive encryption are obtained from the password manager by default.
At the very beginning, obtain a kubeconfig via `scp server@<node>:/etc/rancher/{rke2/k3s}/{rke2/k3s}.yaml kubeconfig.yaml`.
You will have to modify the `server` field in the kubeconfig so that it points to the remote node and not `127.0.0.1` (which is the default).
It's assumed that the `homeserver` and `homeserver-backup` clusters have corresponding contexts created under the names `homeserver` and `homeserver-backup` respectively.
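A sketch of that fix-up, assuming the default context name `default` that k3s/rke2 kubeconfigs generate (repeat for the backup cluster):

```sh
# Point the kubeconfig at the node instead of the loopback address
sed -i 's/127.0.0.1/<node address>/' kubeconfig.yaml

# Rename the context to the name the tooling expects
kubectl config rename-context default homeserver --kubeconfig kubeconfig.yaml
```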
Required tools:
- `kubectl`
- `helm`
- `helmfile`
- `terragrunt`
- `terraform`
A few important charts that will be deployed in this step:
- `cert-manager` for certificate generation (Route53 DNS solver under the hood)
- `ingress-nginx` for reverse proxying
- `victoria-metrics-k8s-stack` for monitoring, configured with PagerDuty and Dead Man's Snitch
- `vault` for secrets and identity management
- `oauth2-proxy` for OIDC support for applications that do not support it natively
- `longhorn` for distributed storage
All the cluster-related configuration is stored under the `helmfile` directory.
Different directories are to be used depending on the cluster.
The instructions below describe how to perform a full (from scratch) deployment:
- cd to `helmfile/core`
- run `DOMAIN=<your domain> helmfile sync`
- cd to `helmfile/vault-terraform`
- run `terragrunt apply`
- cd to `helmfile/services`
- run `DOMAIN=<your domain> helmfile sync`
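At each `helmfile sync` step above you can optionally preview the changes first (this assumes the helm-diff plugin is installed):

```sh
# Dry-run preview of what the sync would change
DOMAIN=<your domain> helmfile diff
```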
While the steps above cover the deployment, some special treatment is needed to initialize `vault` from scratch.
Please follow `helmfile/vault-terraform/vault-setup.md`.
Make sure that you have provided the required values in `helmfile/vault-terraform/terraform.tfvars`.
This cluster largely depends on the `homeserver` setup, e.g. for auth.
Make sure that the above cluster is deployed and ready first:
- cd to `helmfile/backup`
- run `DOMAIN=<your domain> helmfile sync`
Currently the ansible playbook takes care of:
- setting up the CUPS server
- installing (proprietary) drivers for the HP LaserJet Pro P1102 printer
It requires the printer to be connected to the device when the playbook is being applied.
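A quick sanity check before and after running the playbook (the `grep` pattern is just an example; the exact USB ID string varies):

```sh
# Confirm the printer is visible on the USB bus before provisioning
lsusb | grep -i laserjet

# After provisioning, verify that CUPS sees the print queue
lpstat -p
```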