Shared Everything

How TACC Pushed Supercomputing Toward an IO-First Architecture

37 min · Dec 4, 2025

Description

On today's episode of the Shared Everything podcast, Nicole is live at SC25 with Dan Stanzione, Executive Director of the Texas Advanced Computing Center (TACC), for a look at why Horizon required a fundamental architectural reset. Stanzione explains how rising GPU power densities, liquid-cooled 20 megawatt racks, and an increasingly irregular IO profile forced TACC to abandon long-held assumptions about parallel filesystems. After years of watching billions of tiny files, unpredictable 4k and 64k reads, and metadata stalls slow entire machines, TACC moved to an all-solid-state tier and a VAST global namespace built for resilience, consistency, and shared access at scale. He describes how this model simplifies AI and hybrid scientific workflows, why the file system has always been the real point of failure, and how Horizon’s architecture reflects a world where IO, not FLOPS, determines what large-scale science can do next.

