Scaling Intelligence
In this episode of Scaling Intelligence, Kevin Jackson speaks with Raphaël Brochard, General Manager of Energy Aware Runtime (EAR), a spin-off from Barcelona Supercomputing Center, about why energy efficiency is becoming a hard constraint for AI and HPC data centers. EAR delivers real-time, vendor-agnostic energy optimization across CPUs, GPUs, and I/O, automatically reducing GPU energy use by ~20% on AI workloads, with up to ~50% possible through deeper tuning. Raphaël explains how the system works, where it sits in the stack, and what operators at sites and on systems such as LRZ, EDF, and SuperMUC-NG have learned from deploying it. The conversation also explores how energy visibility often reveals hidden inefficiencies, and why smart power capping, peak demand penalties, grid constraints, and carbon-aware scheduling are becoming increasingly important. Finally, they discuss where energy management in large-scale compute is heading, and why energy is quickly becoming a first-class constraint in AI and HPC operations.
13 episodes