cube-k8s/k8s-bare-metal

Bare Metal Kubernetes Cluster - Ansible Automation

This project provides complete Ansible automation for deploying a bare metal Kubernetes cluster from scratch ("The Hard Way") on QEMU virtual machines running Debian.

Current Status

Fully Operational Cluster

  • Control plane components deployed and running
  • 3 worker nodes joined and ready
  • Flannel CNI providing pod networking
  • CoreDNS deployed and functional
  • NGINX Ingress Controller ready to deploy
  • kubectl access configured

Architecture Overview

  • Cluster: 1 master node + 3 worker nodes
  • Network: 10.10.10.0/24
  • CNI: Flannel (VXLAN mode)
  • Pod CIDR: 10.244.0.0/16
  • Service CIDR: 10.96.0.0/12
  • Kubernetes Version: 1.28.0
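The node network, Pod CIDR, and Service CIDR listed above must not overlap, otherwise routing between nodes, pods, and services becomes ambiguous. The following is a minimal sketch of an overlap check in POSIX-style shell; the helper names (`ip_to_int`, `cidr_range`, `cidrs_overlap`) are hypothetical and not part of this repository.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  # Split on dots inside a subshell so the caller's IFS is untouched.
  ( IFS=.; set -- $1; echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 )) )
}

# Print "first last" (as integers) for a CIDR such as 10.96.0.0/12.
cidr_range() {
  local ip prefix mask first
  ip=${1%/*}
  prefix=${1#*/}
  mask=$(( (4294967295 << (32 - prefix)) & 4294967295 ))
  first=$(( $(ip_to_int "$ip") & mask ))
  echo "$first $(( first | (4294967295 - mask) ))"
}

# Succeed (exit 0) if the two CIDRs share at least one address.
cidrs_overlap() {
  local a b
  a=$(cidr_range "$1")
  b=$(cidr_range "$2")
  # ${a% *} is the first address of the range, ${a#* } the last.
  [ "${a% *}" -le "${b#* }" ] && [ "${b% *}" -le "${a#* }" ]
}
```

For example, `cidrs_overlap 10.244.0.0/16 10.96.0.0/12 && echo "overlap"` could be run for each pair of ranges before deployment.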

Prerequisites

Control Machine (where Ansible runs)

  • Ansible 2.12 or higher
  • Python 3.8 or higher
  • SSH access to all cluster nodes
  • CFSSL tools (installed automatically by playbooks)

Target Nodes

  • Debian 11 or 12 (fresh installation)
  • Root SSH access configured
  • Network connectivity on 10.10.10.0/24
  • Minimum 2 CPU cores and 4GB RAM per node

Project Structure

.
├── ansible.cfg                 # Ansible configuration
├── inventory/                  # Inventory and variables
│   ├── hosts.yml              # Node definitions
│   └── group_vars/            # Group-level variables
│       └── all.yml            # Cluster-wide settings
├── playbooks/                  # Step-by-step deployment playbooks
│   ├── 01-prepare-nodes.yml
│   ├── 02-generate-certificates.yml
│   ├── 03-setup-etcd.yml
│   ├── 04-setup-control-plane.yml
│   ├── 05-setup-workers.yml
│   ├── 06-setup-flannel.yml
│   ├── 07-setup-coredns.yml
│   ├── 08-setup-nginx-ingress.yml
│   └── 99-setup-local-kubectl.yml
├── roles/                      # Ansible roles
│   ├── common/                # Node preparation
│   ├── pki/                   # Certificate generation
│   ├── etcd/                  # etcd cluster
│   ├── control-plane/         # API server, controller, scheduler
│   ├── worker/                # kubelet and kube-proxy
│   ├── flannel/               # Flannel CNI
│   ├── cilium/                # Cilium CNI (alternative)
│   ├── metallb/               # Load balancer (future)
│   ├── kong/                  # API Gateway (future)
│   ├── storage/               # Storage provisioner (future)
│   └── observability/         # Monitoring stack (future)
├── docs/                       # Documentation
│   ├── NGINX_INGRESS_GUIDE.md
│   └── CONTAINERD_CONFIG.md
├── kubectl.sh                  # kubectl helper script
└── kubeconfig                  # kubectl configuration

Quick Start

1. Configure Inventory

Edit inventory/hosts.yml with your node IP addresses:

all:
  children:
    k8s_cluster:
      children:
        master:
          hosts:
            k8s-master:
              ansible_host: 10.10.10.101
              ansible_user: root
        workers:
          hosts:
            k8s-worker-01:
              ansible_host: 10.10.10.102
              ansible_user: root
            k8s-worker-02:
              ansible_host: 10.10.10.103
              ansible_user: root
            k8s-worker-03:
              ansible_host: 10.10.10.104
              ansible_user: root
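Before running any playbook, it can help to confirm which addresses the inventory actually contains. The sketch below extracts host names and addresses with `awk`; it assumes the exact layout shown above (each host key on its own line, followed by `ansible_host`), and the function name `list_inventory_hosts` is hypothetical. `ansible-inventory -i inventory/hosts.yml --list` remains the authoritative view.

```shell
# Print "name address" for every host in an inventory shaped like the one above.
list_inventory_hosts() {
  awk '
    # Remember the most recent indented key that has no value (e.g. "k8s-master:").
    /^[ \t]+[A-Za-z0-9_.-]+:[ \t]*$/ { name = $1; sub(/:$/, "", name) }
    # An ansible_host line belongs to the key remembered just before it.
    $1 == "ansible_host:" { print name, $2 }
  ' "$1"
}
```

Running `list_inventory_hosts inventory/hosts.yml` should list one line per node; `ansible -i inventory/hosts.yml all -m ping` then verifies SSH connectivity to each of them.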

2. Configure Variables

Edit inventory/group_vars/all.yml with your cluster settings:

kubernetes_version: "1.28.0"
pod_network_cidr: "10.244.0.0/16"
service_cidr: "10.96.0.0/12"
cluster_name: "cube-k8s"

3. Deploy Cluster

Deploy step-by-step (recommended for first-time deployment):

# 1. Prepare nodes (install containerd, configure system)
ansible-playbook -i inventory/hosts.yml playbooks/01-prepare-nodes.yml

# 2. Generate PKI certificates
ansible-playbook -i inventory/hosts.yml playbooks/02-generate-certificates.yml

# 3. Set up etcd cluster
ansible-playbook -i inventory/hosts.yml playbooks/03-setup-etcd.yml

# 4. Set up control plane (API server, controller manager, scheduler)
ansible-playbook -i inventory/hosts.yml playbooks/04-setup-control-plane.yml

# 5. Set up worker nodes (kubelet, kube-proxy)
ansible-playbook -i inventory/hosts.yml playbooks/05-setup-workers.yml

# 6. Deploy Flannel CNI
ansible-playbook -i inventory/hosts.yml playbooks/06-setup-flannel.yml

# 7. Deploy CoreDNS
ansible-playbook -i inventory/hosts.yml playbooks/07-setup-coredns.yml

# 8. Deploy NGINX Ingress Controller (optional)
ansible-playbook -i inventory/hosts.yml playbooks/08-setup-nginx-ingress.yml

# 9. Set up local kubectl access
ansible-playbook -i inventory/hosts.yml playbooks/99-setup-local-kubectl.yml
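Once the individual steps are known to work, the same sequence can be wrapped in a small function that stops at the first failing playbook. This is only a sketch: `deploy_cluster` and the `ANSIBLE_PLAYBOOK` override variable are hypothetical names, and the optional ingress playbook is deliberately left out.

```shell
# Run the numbered playbooks in order and stop at the first failure.
# ANSIBLE_PLAYBOOK is a hypothetical override hook, e.g. for dry runs.
deploy_cluster() {
  local inventory runner playbook
  inventory=${1:-inventory/hosts.yml}
  runner=${ANSIBLE_PLAYBOOK:-ansible-playbook}
  for playbook in \
    01-prepare-nodes 02-generate-certificates 03-setup-etcd \
    04-setup-control-plane 05-setup-workers 06-setup-flannel \
    07-setup-coredns 99-setup-local-kubectl; do
    echo "==> ${playbook}.yml"
    "$runner" -i "$inventory" "playbooks/${playbook}.yml" || {
      echo "Stopped: ${playbook}.yml failed" >&2
      return 1
    }
  done
}
```

Calling `deploy_cluster` from the repository root uses the default inventory path; stopping early matters because each stage depends on the previous one (for example, etcd needs the certificates from step 2).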

Verification

After deployment, verify the cluster:

# Check node status
./kubectl.sh get nodes

# Check all pods
./kubectl.sh get pods -A

# Check system components (componentstatuses has been deprecated since Kubernetes v1.19 and may report incomplete results)
./kubectl.sh get componentstatuses

# Test pod deployment
./kubectl.sh create deployment nginx --image=nginx:alpine --replicas=3
./kubectl.sh get pods -o wide

# Test DNS resolution
./kubectl.sh run test-dns --image=busybox:1.28 --rm -it --restart=Never -- nslookup kubernetes.default

# Test ingress controller (if deployed)
./kubectl.sh get pods -n ingress-nginx
./kubectl.sh get svc -n ingress-nginx
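For scripted checks, the node listing can be filtered so that only problem nodes are printed. A minimal sketch, assuming the default column layout of `kubectl get nodes --no-headers` (name in column 1, status in column 2); the function name `not_ready_nodes` is hypothetical.

```shell
# Read `kubectl get nodes --no-headers` output on stdin and print every
# node whose STATUS column is not exactly "Ready".
# Note: a cordoned node ("Ready,SchedulingDisabled") is also reported.
not_ready_nodes() {
  awk '$2 != "Ready" { print $1 }'
}
```

Usage: `./kubectl.sh get nodes --no-headers | not_ready_nodes`. Empty output means every node reports Ready. To block until nodes become ready instead, `kubectl wait --for=condition=Ready node --all --timeout=120s` is an alternative.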

Testing NGINX Ingress Controller

If you deployed the ingress controller, test it with the example app:

# Deploy test application
./kubectl.sh apply -f examples/nginx-ingress/test-app.yaml

# Add to /etc/hosts
echo "10.10.10.102  hello.local" | sudo tee -a /etc/hosts

# Test the ingress
curl http://hello.local:30080

# Clean up
./kubectl.sh delete -f examples/nginx-ingress/test-app.yaml

See examples/nginx-ingress/README.md for more examples.
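The same request can be made without editing /etc/hosts by letting curl pin the host name to a node address for a single call. A sketch, assuming the NodePort 30080 and the worker address used above; the function name `ingress_get` is hypothetical.

```shell
# Request an ingress host through a NodePort without touching /etc/hosts.
# --resolve maps HOST:PORT to the given address for this one request only.
ingress_get() {
  local host node_ip port
  host=$1
  node_ip=$2
  port=${3:-30080}
  curl --silent --show-error --fail --max-time 5 \
    --resolve "${host}:${port}:${node_ip}" "http://${host}:${port}/"
}
```

Usage: `ingress_get hello.local 10.10.10.102`. Because `--fail` is set, an HTTP error status from the ingress produces a non-zero exit code, which makes the function usable in scripts.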

Cluster Architecture

Control Plane (10.10.10.101)
├── etcd (v3.5.9)
├── kube-apiserver (v1.28.0)
├── kube-controller-manager (v1.28.0)
└── kube-scheduler (v1.28.0)

Worker Nodes
├── k8s-worker-01 (10.10.10.102)
├── k8s-worker-02 (10.10.10.103)
└── k8s-worker-03 (10.10.10.104)

Each worker runs:
├── kubelet (v1.28.0)
├── kube-proxy (v1.28.0)
├── containerd (v2.2.0)
└── flannel (v0.28.0)

Network Configuration

  • Pod CIDR: 10.244.0.0/16 (Flannel VXLAN)
  • Service CIDR: 10.96.0.0/12
  • CNI: Flannel in VXLAN mode
  • Pod CIDRs per node:
    • k8s-worker-01: 10.244.1.0/24
    • k8s-worker-02: 10.244.2.0/24
    • k8s-worker-03: 10.244.3.0/24
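When debugging, the per-node ranges above make it possible to tell from a pod IP which worker hosts the pod. The sketch below hard-codes the table from this section, which is an assumption: the actual assignment is recorded in each node's `.spec.podCIDR` and may differ on a rebuilt cluster. The function name `node_for_pod_ip` is hypothetical.

```shell
# Given a pod IP, print the worker whose /24 contains it (per the table above).
# Returns 1 and prints "unknown" for addresses outside the listed ranges.
node_for_pod_ip() {
  case "$1" in
    10.244.1.*) echo "k8s-worker-01" ;;
    10.244.2.*) echo "k8s-worker-02" ;;
    10.244.3.*) echo "k8s-worker-03" ;;
    *) echo "unknown"; return 1 ;;
  esac
}
```

The live values can be listed with `./kubectl.sh get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podCIDR}{"\n"}{end}'` and compared against the table.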

Components

Core Components (Deployed)

etcd

  • Distributed key-value store for cluster state
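  • Runs on the control plane node (10.10.10.101); with the single master described above this is a single member, so high availability would require at least three members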
  • TLS-secured communication

Control Plane

  • kube-apiserver: REST API for cluster management
  • kube-controller-manager: Manages controllers (replication, endpoints, etc.)
  • kube-scheduler: Assigns pods to nodes

Worker Components

  • kubelet: Node agent that manages pods
  • kube-proxy: Network proxy for service abstraction
  • containerd: Container runtime (CRI-compatible)

Networking

  • Flannel CNI: Pod networking using VXLAN overlay
  • CoreDNS: Cluster DNS service
  • NGINX Ingress Controller: HTTP/HTTPS routing (optional)

Future Components (Ready to Deploy)

Cilium CNI (Alternative to Flannel)

  • eBPF-based networking
  • BGP control plane for route advertisement
  • Hubble for network observability
  • Can replace Flannel for advanced features

MetalLB

  • BGP mode for dynamic route advertisement
  • LoadBalancer service type for bare metal
  • IP pool: 10.10.10.150-10.10.10.200

Kong Gateway

  • Kubernetes Gateway API v1 implementation
  • DB-less (declarative) mode
  • Advanced features: rate limiting, auth, transformations

Local Path Provisioner

  • Dynamic PV provisioning using local storage
  • Default storage class
  • Node-local volumes

Observability Stack

  • Prometheus: Metrics collection
  • Grafana: Visualization dashboards
  • Loki: Log aggregation

Troubleshooting

Nodes Not Ready

# Check kubelet status on a worker (k8s-worker-01 here)
ssh root@10.10.10.102 "systemctl status kubelet"

# Check kubelet logs
ssh root@10.10.10.102 "journalctl -u kubelet -f"
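To see at a glance which worker is affected, the kubelet state can be collected from every node in one pass. A sketch, assuming root SSH access as configured in the inventory; `kubelet_states` and the `SSH_CMD` override variable are hypothetical names.

```shell
# Print "<host> <state>" for the kubelet unit on each host given as argument.
# SSH_CMD is a hypothetical override hook; it defaults to plain ssh.
kubelet_states() {
  local ssh_cmd host state
  ssh_cmd=${SSH_CMD:-ssh}
  for host in "$@"; do
    # `systemctl is-active` prints active/inactive/failed and exits non-zero
    # unless the unit is active, so the exit status is ignored here.
    state=$("$ssh_cmd" -o ConnectTimeout=5 "root@${host}" systemctl is-active kubelet 2>/dev/null) || true
    echo "${host} ${state:-unreachable}"
  done
}
```

Usage: `kubelet_states 10.10.10.102 10.10.10.103 10.10.10.104`. Any host that does not report `active` is a candidate for the `journalctl` command above.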

Pod Network Issues

# Check Flannel pods
./kubectl.sh get pods -n kube-flannel

# Check Flannel logs
./kubectl.sh logs -n kube-flannel -l app=flannel

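# Test pod-to-pod connectivity
./kubectl.sh run test1 --image=busybox --restart=Never -- sleep 3600
./kubectl.sh run test2 --image=busybox --restart=Never -- sleep 3600
# Wait until both pods are Ready so that test2 has an IP assigned
./kubectl.sh wait --for=condition=Ready pod/test1 pod/test2 --timeout=60s
POD2_IP=$(./kubectl.sh get pod test2 -o jsonpath='{.status.podIP}')
./kubectl.sh exec test1 -- ping -c 3 "$POD2_IP"
# Clean up the test pods
./kubectl.sh delete pod test1 test2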

Control Plane Issues

# Check control plane component status
ssh root@10.10.10.101 "systemctl status kube-apiserver kube-controller-manager kube-scheduler"

# Check API server logs
ssh root@10.10.10.101 "journalctl -u kube-apiserver -n 50"

Certificate Issues

# Regenerate certificates
ansible-playbook -i inventory/hosts.yml playbooks/02-generate-certificates.yml

# Check certificate expiration
ssh root@10.10.10.101 "openssl x509 -in /etc/kubernetes/pki/apiserver.pem -noout -dates"
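Reading dates by eye does not scale to scripted checks; `openssl x509 -checkend` answers the question "does this certificate expire within N seconds?" through its exit code. A sketch to be run on the node that holds the certificate; the function name `cert_expires_within` and its exit-code convention are hypothetical.

```shell
# Exit 0 if the certificate expires within DAYS days, 1 if it stays valid
# for longer than that, and 2 if the file cannot be read.
cert_expires_within() {
  local pem days
  pem=$1
  days=$2
  [ -r "$pem" ] || return 2
  # -checkend N exits 0 when the certificate is still valid N seconds from now.
  if openssl x509 -in "$pem" -noout -checkend "$(( days * 86400 ))" >/dev/null; then
    return 1
  fi
  return 0
}
```

Usage: `cert_expires_within /etc/kubernetes/pki/apiserver.pem 30 && echo "renew soon"`. If the check fires, the certificate playbook above can be rerun.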

Documentation

Using kubectl

Use the helper script for easy kubectl access:

./kubectl.sh get nodes
./kubectl.sh get pods -A
./kubectl.sh run test --image=nginx:alpine

Or set the KUBECONFIG environment variable:

export KUBECONFIG=$(pwd)/kubeconfig
kubectl get nodes

See docs/KUBECTL_ACCESS.md for more details.

Resetting the Cluster

To completely remove all Kubernetes components and start fresh:

ansible-playbook -i inventory/hosts.yml reset-cluster.yml

This will:

  • Stop all Kubernetes services
  • Remove all binaries and configuration
  • Clean up data directories
  • Reset iptables rules
  • Preserve containerd for redeployment

Next Steps

The cluster is ready for additional components:

  1. MetalLB - Load balancer for bare metal (BGP mode)
  2. Cilium - Advanced CNI with eBPF and BGP (can replace Flannel)
  3. Kong Gateway - API Gateway with Gateway API v1
  4. Storage - Persistent storage provisioner
  5. Observability - Prometheus, Grafana, Loki stack

License

This project is for educational and laboratory purposes.

Contributing

This is a personal lab project. Feel free to use it as a reference for your own bare-metal Kubernetes deployments.
