Your AI, Your Way
Most enterprises invest in high-end compute and connect it to network infrastructure that was designed for office traffic. Email, file shares, video calls. The assumption is that if the GPUs are powerful enough, the rest of the stack will keep up. It does not. A GPU can only process data as fast as the network delivers it. When the network runs at 200 gigabit and the workload demands 800, compute sits idle. You pay for the race car engine but starve it of fuel.

For a basic chatbot, you can get away with it. Occasional prompts create short spikes. The system holds. But agentic AI runs at continuous peak load. The network speeds involved are moving from 200 gigabit to 400, to 800, and the next generation targets 1.6 terabit. Most enterprise IT teams have never worked at these levels. Meanwhile, the metric that will define AI economics is one most companies do not track yet: cost per token.

In this discussion recorded at Cisco Studio Amsterdam, Sander ten Hoedt (Cisco) and Raymond Drielinger (MDCS.AI) explain why AI infrastructure behaves like a production line, and why that production line fails when the data flow cannot keep up.

Key topics include:

* Why connecting expensive GPUs to an office network is like putting a Formula 1 engine in a car with narrow fuel lines.
* How GPU utilization is often a network problem, not a compute problem, and why that directly drives up cost per token.
* The difference between chatbot traffic and agentic AI workloads, and why the latter demands a fundamentally different infrastructure philosophy.
* Why most enterprises do not yet measure cost per token, and why pharma and financial services are ahead of the curve.
* Infrastructure checks every organization should run before scaling AI beyond pilot.
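
The link between network speed, GPU utilization, and cost per token described above can be made concrete with a rough back-of-the-envelope sketch. It assumes, as a simplification not stated in the episode, that token throughput scales linearly with the share of demanded bandwidth the network actually delivers. The function names, the hourly GPU cost, and the peak token rate are hypothetical values chosen for illustration only; the 200 and 800 gigabit figures come from the description.

```python
def network_bound_utilization(link_gbps: float, demand_gbps: float) -> float:
    """Upper bound on GPU utilization when the network is the only bottleneck.

    Simplifying assumption: the GPU stays busy in proportion to the share
    of the bandwidth it needs that the network can actually deliver.
    """
    if link_gbps <= 0 or demand_gbps <= 0:
        raise ValueError("bandwidth values must be positive")
    return min(1.0, link_gbps / demand_gbps)


def cost_per_million_tokens(gpu_cost_per_hour: float,
                            peak_tokens_per_second: float,
                            utilization: float) -> float:
    """Cost of producing one million tokens at a given GPU utilization."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    tokens_per_hour = peak_tokens_per_second * utilization * 3600
    return gpu_cost_per_hour / tokens_per_hour * 1_000_000


if __name__ == "__main__":
    # Hypothetical inputs, not figures from the episode.
    gpu_cost_per_hour = 4.0          # currency units per GPU-hour (assumed)
    peak_tokens_per_second = 1000.0  # throughput of a fully fed GPU (assumed)

    # Bandwidth figures taken from the description: 200 Gbit/s available,
    # 800 Gbit/s demanded by the workload.
    starved = network_bound_utilization(200, 800)
    fed = network_bound_utilization(800, 800)

    for label, util in (("200G link", starved), ("800G link", fed)):
        cost = cost_per_million_tokens(gpu_cost_per_hour,
                                       peak_tokens_per_second, util)
        print(f"{label}: utilization {util:.0%}, "
              f"cost per million tokens {cost:.2f}")
```

Under these assumptions the GPU behind the 200 gigabit link is busy at most a quarter of the time, so each token costs four times as much as on a network that keeps up. Real workloads overlap communication with computation, so the actual penalty depends on the workload and should be measured rather than derived from this ratio.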