Tech Beats Unplugged
👋🏼 Hey AI heads 🎙️ Join us for the very first Tech Beats Live 🔴, hosted by Kosseila, aka @CloudDude from @CloudThrill.

🎯 This chill, laid-back livestream will unpack LLM quantization 🔥:
* ✅ WHY it matters
* ✅ HOW it works
* ✅ Enterprise (vLLM) vs Consumer (@Ollama) trade-offs
* ✅ and WHERE it’s going next.

We’ll be joined by two incredible guest stars to talk Enterprise vs Consumer Quantz 🗣️:
🔷 Eldar Kurtić: bringing the enterprise perspective with vLLM.
🔷 Colin Kealty: aka Bartowski, creator of the top-downloaded GGUF quantized LLMs on Hugging Face.

🫵🏼 Come learn and have some fun 😎.

𝐂𝐡𝐚𝐩𝐭𝐞𝐫𝐬:
(00:00) Host Introduction
(04:07) Eldar Intro
(07:33) Bartowski Intro
(13:04) What’s Quantization?
(16:19) Why LLM Quantization Matters
(20:39) Training vs Inference – “The New Deal”
(27:46) Biggest Misconception About Quantization
(33:22) Enterprise Quantization in Production (vLLM)
(48:48) Consumer LLMs & Quantization (Ollama, llama.cpp, GGUF) – “LLMs for the People”
(01:06:45) BitNet 1-Bit Quantization from Microsoft
(01:28:14) How Long It Takes to Quantize a Model (Llama-3 70B) – GGUF or llm-compressor
(01:34:23) What Is I-Matrix & Why People Confuse It with IQ Quantization?
(01:39:36) What’s LoRA & LoRA-Q?
(01:42:36) What Is Sparsity?
(01:47:42) What Is Distillation?
(01:52:34) Extreme Quantization (Unsloth) of Big Models (DeepSeek) at 2 Bits, 70% Size Cut
(01:57:27) Will Future Models (Llama-5) Be Trained on FP4 Tensor Cores?
(02:02:15) The Future of LLMs on Edge Devices (Google AI Edge)
(02:08:00) How to Evaluate the Quality of a Quantized Model
(02:26:09) Hugging Face’s Role in the World of LLM/Quantization
(02:33:46) Hugging Face’s Role in the World of LLM/Quantization
(02:36:41) LocalLlama Sub-Reddit Down (Moderator Goes Bananas)
(02:40:11) Guests’ Hope for the Future of LLMs & AI in General

📖 Check out the quantization blog: https://bitly/LLMQuant [https://cloudthrill.ca/llm-quantization-all-you-need-to-know]

#AI #LLM #Quantization #TechBeatsLive #LocalLlama #vLLM #Ollama
10 episodes