DevOps & Cloud Interview Questions and Answers - Part 1
SCENARIO: You deploy a new ML training job that requires 8 GPUs, but its pods are stuck in Pending. The K8s scheduler logs show 'no nodes available'. Walk me through exactly what Karpenter does to resolve this, step by step.

WHAT THEY'RE TESTING: The K8s scheduler's role versus Karpenter's, and the 4-step lifecycle.

THE ANSWER:
• WATCH: The Karpenter controller watches for pods that the K8s scheduler has marked 'unschedulable'.
• EVALUATE: It reads ALL constraints from the Pod spec:
  - Resource requests (8 GPUs, memory, CPU)
  - nodeSelector, nodeAffinity, tolerations
  - Topology spread constraints
• PROVISION: It calls the AWS EC2 API to launch an instance that matches ALL requirements:
  - Selects p3.16xlarge (8 GPUs) in the correct zone
  - Applies the NodePool's taints, labels, kubelet config
• RESULT: The node joins the cluster and the K8s scheduler binds the pod.
→ Key insight: Karpenter provisions the node; the K8s scheduler still does the final binding!
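
The four steps above can be sketched as a toy reconcile loop. This is a minimal illustration, not Karpenter's actual code: the names `InstanceType`, `PendingPod`, `CATALOG`, `fits`, `choose_instance` and `reconcile` are all hypothetical, the zones and hourly prices are made up, and picking the cheapest fitting instance per pod is an assumed simplification (real provisioning also batches pods and accounts for node overhead). Only the GPU, vCPU and memory figures reflect the published instance specs.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class InstanceType:
    name: str
    gpus: int
    vcpus: int
    memory_gib: int
    zones: Tuple[str, ...]
    hourly_price: float  # illustrative number, not real pricing


@dataclass
class PendingPod:
    name: str
    gpus: int
    vcpus: int
    memory_gib: int
    allowed_zones: Tuple[str, ...] = ()  # empty tuple means "any zone"
    unschedulable: bool = True           # set by the scheduler, not by us


# Hypothetical catalog: zones and prices are assumptions for the example.
CATALOG = [
    InstanceType("g4dn.xlarge", 1, 4, 16, ("us-east-1a", "us-east-1b"), 0.5),
    InstanceType("p3.8xlarge", 4, 32, 244, ("us-east-1a",), 12.0),
    InstanceType("p3.16xlarge", 8, 64, 488, ("us-east-1a", "us-east-1b"), 24.0),
]


def fits(pod: PendingPod, it: InstanceType) -> bool:
    """EVALUATE: does this instance type satisfy every constraint on the pod?"""
    zone_ok = not pod.allowed_zones or any(z in it.zones for z in pod.allowed_zones)
    return (
        it.gpus >= pod.gpus
        and it.vcpus >= pod.vcpus
        and it.memory_gib >= pod.memory_gib
        and zone_ok
    )


def choose_instance(pod: PendingPod, catalog: List[InstanceType]) -> Optional[InstanceType]:
    """Pick the cheapest instance type that fits, or None if nothing fits."""
    candidates = [it for it in catalog if fits(pod, it)]
    return min(candidates, key=lambda it: it.hourly_price, default=None)


def reconcile(pods: List[PendingPod], catalog: List[InstanceType]) -> List[Tuple[str, str]]:
    """Return (pod name, instance type) pairs for the nodes that would be launched."""
    launched = []
    for pod in pods:
        if not pod.unschedulable:  # WATCH: only react to unschedulable pods
            continue
        choice = choose_instance(pod, catalog)  # EVALUATE
        if choice is None:
            continue  # nothing in the catalog can satisfy this pod
        # PROVISION: stand-in for the cloud API call that launches the node.
        launched.append((pod.name, choice.name))
    # RESULT: binding the pod to the new node is left to the scheduler.
    return launched


if __name__ == "__main__":
    job = PendingPod("ml-train", gpus=8, vcpus=32, memory_gib=256)
    print(reconcile([job], CATALOG))  # [('ml-train', 'p3.16xlarge')]
```

Note that `reconcile` never binds anything: it only decides which node to launch, which mirrors the key insight that provisioning and binding are separate responsibilities.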