An upcoming tutorial at SC24, to be presented by Mahidhar Tatineni, Dmitry Mishin, Carlos Arango Guttierez, and Angel Beltre.
MORNING:
*Intro and Welcome
*Kubernetes Intro and Architecture
- Hands On: Intro to Kubernetes
*Containerized software stack
*Kubernetes Scheduling
- Hands On: Understanding Scheduling
*Interactive computing using Kubernetes
*Usage Workflow in Kubernetes
- Hands On: Realistic compute workflow
*AI and computational science research applications with Kubernetes
AFTERNOON:
*Hands On: AI Examples
- AI training using PyTorch example
- Text generation inference example
- RAG example using Ollama
- Helm-based deployment of an LLM as a service
*Persistent Storage and I/O considerations for complex workloads
- Hands On: Storage
*Introduction to GPU and MPI Operators in Kubernetes
- Hands On: GPU and MPI implementation examples
*Job Monitoring in Kubernetes
*Q&A with attendees and discussion of custom requirements