Steven AI Talk
Stanford CS336 Lecture 9 dives into the scaling laws that govern AI performance. We're moving from the "bigger is better" Kaplan era into the "data-rich" Chinchilla era.

Key Takeaways:
🔹 Chinchilla Laws: Compute-optimal training requires ~20 tokens per parameter.
🔹 Inference-Optimal Scaling: Why models like Llama 3 are trained far beyond the Chinchilla point to save on deployment costs.
🔹 Predictability: Scaling laws allow us to project the performance of massive models using small experiments that cost just a fraction as much.
🔹 The Data Wall: How synthetic data and quality filtering are becoming the new focus.

Scaling is no longer an art; it's an engineering blueprint. Read our full technical breakdown and transcripts!

All my links: https://linktr.ee/learnbydoingwithsteven

#learnbydoingwithsteven #AI #ScalingLaws #LLM #DeepLearning #StanfordCS336 #DataScience #MachineLearning #Chinchilla #Llama3