Shared Everything

How TACC Pushed Supercomputing Toward an IO-First Architecture

37 min · 4 Dec 2025

Description

On today's episode of the Shared Everything podcast, Nicole is live at SC25 with Dan Stanzione, Executive Director of the Texas Advanced Computing Center (TACC), for a look at why Horizon required a fundamental architectural reset. Stanzione explains how rising GPU power densities, liquid-cooled 20-megawatt racks, and an increasingly irregular IO profile forced TACC to abandon long-held assumptions about parallel filesystems. After years of watching billions of tiny files, unpredictable 4k and 64k reads, and metadata stalls slow entire machines, TACC moved to an all-solid-state tier and a VAST global namespace built for resilience, consistency, and shared access at scale. He describes how this model simplifies AI and hybrid scientific workflows, why the file system has always been the real point of failure, and how Horizon’s architecture reflects a world where IO, not FLOPS, determines what large-scale science can do next.

