What The Tech
In early 2025, a small Chinese startup called DeepSeek sent a shockwave through the technology sector. Its R1 model didn't just rival the performance of the world's most advanced reasoning models; it reportedly did so at a fraction of the cost, and the news wiped roughly $600 billion off Nvidia's market cap in a single day. This episode deconstructs the controversial "computer science trick" at the heart of this disruption: Knowledge Distillation. We explore how the industry is moving away from "Foundational Giants" that cost hundreds of millions of dollars to train, toward a more democratized era where startups and independent researchers can finally compete with the tech titans. We dive into the "Teacher-Student" process that makes this possible. Imagine a master chef teaching a student not just to follow a recipe, but to understand the "intuition" behind every ingredient. We explain the difference between White-Box Distillation, which reads a model's internal "soft targets" or "dark knowledge" (the full probability distribution the teacher assigns to every possible answer, not just its top pick), and the more elusive Black-Box Distillation, which DeepSeek reportedly used to mimic a teacher's behavior by training on its answers to millions of targeted questions. As we look toward 2026, we tackle the growing legal and ethical gray areas of this "AI shrinking ray": what happens to intellectual property when a company's public-facing product effectively becomes the training data for its competitors?
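For listeners who want to see the "soft targets" idea in concrete terms, here is a minimal sketch of the white-box distillation loss in plain Python. It is an illustration under stated assumptions, not DeepSeek's or any lab's actual training code: the logits are made-up numbers, and the function names `softmax` and `distillation_loss` are our own. The temperature-softened softmax and KL-divergence loss follow the standard textbook formulation of knowledge distillation.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's."""
    p = softmax(teacher_logits, temperature)  # teacher "soft targets"
    q = softmax(student_logits, temperature)  # student prediction
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical logits over three candidate answers (illustrative values only).
teacher_logits = [6.0, 2.0, 1.0]
student_logits = [3.0, 2.5, 0.5]

hard = softmax(teacher_logits, temperature=1.0)  # nearly all mass on the top answer
soft = softmax(teacher_logits, temperature=4.0)  # runner-up answers become visible
loss = distillation_loss(teacher_logits, student_logits)

print("hard targets:", [round(x, 3) for x in hard])
print("soft targets:", [round(x, 3) for x in soft])
print("distillation loss:", round(loss, 4))
```

The softened distribution is the "dark knowledge": it shows how the teacher ranks the answers it did not choose. A black-box student never sees these numbers. It only sees the teacher's final text answers and has to learn from those alone.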
8 episodes