gwklok commented Mar 19, 2018
- Remove iPXE and replace it with a pxelinux file served via tftpd, letting the initrd load the SFS layers. This cuts build time and often worker boot time. It should also result in broader compatibility, e.g. it fixes the Broadcom gigabit ethernet issues and the bug where VirtualBox fails to boot after a reset.
- Fix typo preventing credentials for kube from getting loaded for ceph volume detach.
- Wire up Prometheus to collect ceph metrics.
- Make worker disk partitioning a bit more robust.
No longer chainload iPXE; it caused lots of issues, e.g. with Broadcom gigabit ethernet cards and VirtualBox being unable to boot after a VM reset. Use the simpler tftpboot/pxelinux boot instead; the initrd can handle getting the Operos SFS files over HTTP. This cuts a good amount of time out of the build, and workers boot slightly quicker.
When we encounter recycled disks, remove all signatures from the disk. pvcreate gives up if there are Linux RAID metadata blocks on a partition, so zero some blocks in that partition.
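A minimal sketch of the recycled-disk cleanup described in the commit message above, not the code from this PR. The function names `zero_head` and `scrub_partition` and the 4 MiB count are hypothetical; the sketch assumes GNU `dd` and util-linux `wipefs`, and it only clears the start of the partition, so RAID metadata formats that live at the end of a device would need separate handling.

```shell
# Hypothetical helper: overwrite the first COUNT MiB of TARGET with zeros.
# TARGET may be a partition (e.g. /dev/sdX1) or, for testing, a regular file.
zero_head() {
    local target="$1"
    local count_mib="${2:-4}"
    # conv=notrunc keeps a regular file's length intact; it is harmless on block devices.
    dd if=/dev/zero of="$target" bs=1M count="$count_mib" conv=notrunc status=none
}

# Hypothetical wrapper: erase detectable signatures, then zero the head of the partition
# so that leftover metadata does not make pvcreate give up.
scrub_partition() {
    local part="$1"
    # wipefs -a erases all signatures that libblkid can detect on the device.
    wipefs -a "$part"
    zero_head "$part" 4
}
```

Calling `scrub_partition` on each partition of a recycled disk before `pvcreate` would be one way to use it; this is destructive, so the device path needs to be checked carefully first.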
- Fix bug with kube credentials that prevented volume detach.
- Wire up Prometheus to collect ceph-mgr metrics.
rlisagor left a comment
In my branch (https://github.com/rlisagor/operos/tree/upgrade), certain files are very different - e.g. the installer scripts, start-addons, etc. So some of these changes might have to be adapted/rewritten.
cat /etc/ceph/ceph.conf | etcd_cmd put "cluster/$OPEROS_INSTALL_ID/ceph-config"
/usr/bin/ceph auth get client.kube | etcd_cmd put "cluster/$OPEROS_INSTALL_ID/secret-ceph-kube-keyring"

wait_for_unit 5 ceph-mgr@${CHOSTNAME}
I believe systemctl start already waits for the service to become active, so this shouldn't be necessary.
You're right. I'll have to think of something else or go back to the sleep hammer. This only works because it inserts a delay long enough to cover the gap between systemd starting the service and the service actually becoming ready. I might be able to check readiness with commands, but they usually have timeouts of dozens of seconds.
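One possible middle ground between a fixed sleep and a long-blocking check is a bounded polling loop. The sketch below is an illustration only: `wait_until` is a hypothetical helper, not something in this PR, and the probe passed to it is left to the caller.

```shell
# Hypothetical helper: retry a probe command once per second until it succeeds
# or TIMEOUT seconds have elapsed.
# Usage: wait_until TIMEOUT_SECONDS command [args...]
wait_until() {
    local timeout_s="$1"
    shift
    local deadline
    deadline=$(( $(date +%s) + timeout_s ))
    # Run the probe; on failure, give up once the deadline has passed.
    until "$@"; do
        if [ "$(date +%s)" -ge "$deadline" ]; then
            return 1
        fi
        sleep 1
    done
}
```

For example, `wait_until 30 some_probe` returns as soon as `some_probe` (a placeholder for whatever readiness check fits) succeeds, and fails after roughly 30 seconds otherwise. If the probe itself can block for a long time, wrapping it in coreutils `timeout` keeps each attempt short.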
/usr/bin/ceph config-key set mgr/prometheus/server_addr ${OPEROS_CONTROLLER_IP}

systemctl enable ceph-mgr@${CHOSTNAME}.service
systemctl start ceph-mgr@${CHOSTNAME}.service
FYI: another way to enable and start at the same time is systemctl enable --now ceph-mgr@${CHOSTNAME}.service
# tftp
cat > /mnt/etc/conf.d/tftpd <<EOF
TFTPD_ARGS="--verbose --address ${OPEROS_CONTROLLER_IP} -m /tftpboot/mapfile -u ftp --secure /tftpboot"
TFTPD_ARGS="--verbose --address ${OPEROS_CONTROLLER_IP} -m /etc//tftpd.mapfile -u ftp --secure /boot"
/etc//tftpd.mapfile -> /etc/tftpd.mapfile ?
yup fixing