Shared Everything

How TACC Pushed Supercomputing Toward an IO-First Architecture

37 min · 4 Dec 2025
Description

On today's episode of the Shared Everything podcast, Nicole is live at SC25 with Dan Stanzione, Executive Director of the Texas Advanced Computing Center (TACC), for a look at why Horizon required a fundamental architectural reset. Stanzione explains how rising GPU power densities, liquid-cooled 20-megawatt racks, and an increasingly irregular IO profile forced TACC to abandon long-held assumptions about parallel filesystems. Years of watching billions of tiny files, unpredictable 4k and 64k reads, and metadata stalls slow entire machines led the center to an all-solid-state tier and a VAST global namespace built for resilience, consistency, and shared access at scale. He describes how this model simplifies AI and hybrid scientific workflows, why the file system has always been the real point of failure, and how Horizon's architecture reflects a world where IO, not FLOPS, determines what large-scale science can do next.
