AI Bites: The Academic Series
A bite-sized, visual breakdown of CS224N Lecture 14! In this new NotebookLM Video Short, we pull back the curtain on the invisible preprocessing layer of modern AI: tokenization.

Key Topics:
* The "Strawberry" Problem: A visual look at why ChatGPT struggles to count letters or spell words backwards: it sees opaque token chunks, not individual characters.
* The Multilingual Tax: A direct comparison showing how English-biased tokenizers shatter non-English prompts (such as Thai or Somali) into dozens of inefficient fragments, so global users pay more for worse AI performance.
* The Return to Bytes: A quick look at next-generation tokenizer-free architectures, such as Google's CANINE, which reads raw characters, and MrT5, which dynamically drops bytes, that aim to reduce this inequality.

Note: This is an AI-generated visual discussion created using Google's NotebookLM, based on publicly available Stanford University course material (specifically CS224N) and personal study notes from my learning journey.
55 episodes