Claude Sonnet 5 is HERE!

9 min · Yesterday

Description

Claude Sonnet 5 Review: More Expensive, Worse Than Opus 4.8? (Benchmarks & Agent Tests)

The video reviews Anthropic’s newly released Claude Sonnet 5, described as more agentic and capable of planning and tool use, but argues it underperforms Opus 4.8 on benchmarks (including agentic coding) while costing more. The creator shares Goldy Bench examples Sonnet 5 generated (a ray caster maze, a broken galaxy orbit test, a synthwave background, and a crypt game), noting some outputs look good but others fail. Side-by-side comparisons show mixed results versus GLM 5.2, with GLM succeeding on tasks Sonnet 5 fails, and tweets highlight negative reception focused on poor token efficiency and pricing. The recommendation is to keep using Opus 4.8, expect Fable 5 soon, and focus on building flexible agent systems that can swap models in and out.

00:00 [https://www.youtube.com/watch?v=1Wl-4D6D5rw] Sonnet 5 Launch
00:30 [https://www.youtube.com/watch?v=1Wl-4D6D5rw&t=30s] Benchmarks vs Opus
01:39 [https://www.youtube.com/watch?v=1Wl-4D6D5rw&t=99s] Goldy Bench Demos
02:53 [https://www.youtube.com/watch?v=1Wl-4D6D5rw&t=173s] GLM 5.2 Comparisons
04:00 [https://www.youtube.com/watch?v=1Wl-4D6D5rw&t=240s] Backlash and Pricing
05:57 [https://www.youtube.com/watch?v=1Wl-4D6D5rw&t=357s] Fugu Ultra Showdown
07:20 [https://www.youtube.com/watch?v=1Wl-4D6D5rw&t=440s] Why Release This
08:00 [https://www.youtube.com/watch?v=1Wl-4D6D5rw&t=480s] Focus on Systems
09:11 [https://www.youtube.com/watch?v=1Wl-4D6D5rw&t=551s] Agent OS Pitch
09:48 [https://www.youtube.com/watch?v=1Wl-4D6D5rw&t=588s] Final Verdict

Sign up now and become part of the AI News Today | Julian Goldie Podcast community!

All episodes

535 episodes

Agent OS + Obsidian + Free APIs + Agent Teams!

Agent OS Updates + Community Q&A: Hermes, GLM 5.2, Memory, SEO Pipeline, and Model Picks

This episode answers recent community questions about the Agent Operating System (Agent OS), which centralizes and orchestrates multiple AI agents in one place. It covers recent updates including a Hermes lead generation tool, Mixture of Agents testing, an auto-updating memory system using Obsidian, a new GLM Code section to use GLM 5.2 with agent harnesses like Claude Code, NotebookLM short video generation and research import, an expanded SEO content pipeline with OpenSEO, and plans to add Fable 5 as a default CLI when restored. The host advises staying focused with a “Focus Protocol,” recommends Agent OS for managing multiple tools, shares guidance on Docker and GitHub/data concerns, compares models (preferring Opus 4.8 and GLM 5.2 over Sonnet 5), suggests SEO stacks for local businesses, and highlights community wins, customization examples, and how to join the AI Profit Boardroom for training, support, and the full Agent OS.

00:00 [https://www.youtube.com/watch?v=nm4xNnbSI14] Agent OS Updates
01:51 [https://www.youtube.com/watch?v=nm4xNnbSI14&t=111s] Focus Protocol
03:21 [https://www.youtube.com/watch?v=nm4xNnbSI14&t=201s] Why Use Agent OS
04:49 [https://www.youtube.com/watch?v=nm4xNnbSI14&t=289s] Docker and GitHub
06:47 [https://www.youtube.com/watch?v=nm4xNnbSI14&t=407s] Model News and Picks
07:52 [https://www.youtube.com/watch?v=nm4xNnbSI14&t=472s] SEO Side Gig Stack
08:58 [https://www.youtube.com/watch?v=nm4xNnbSI14&t=538s] Cheaper Models Setup
09:32 [https://www.youtube.com/watch?v=nm4xNnbSI14&t=572s] Community Wins Workflows
11:45 [https://www.youtube.com/watch?v=nm4xNnbSI14&t=705s] Free Models OwlAlpha
12:33 [https://www.youtube.com/watch?v=nm4xNnbSI14&t=753s] Themes and Customization
13:41 [https://www.youtube.com/watch?v=nm4xNnbSI14&t=821s] Best Memory System
14:48 [https://www.youtube.com/watch?v=nm4xNnbSI14&t=888s] Kanban Orchestration
15:44 [https://www.youtube.com/watch?v=nm4xNnbSI14&t=944s] Ollama vs Hermes
16:27 [https://www.youtube.com/watch?v=nm4xNnbSI14&t=987s] More Memory Advice
17:22 [https://www.youtube.com/watch?v=nm4xNnbSI14&t=1042s] Custom Desks Example
18:30 [https://www.youtube.com/watch?v=nm4xNnbSI14&t=1110s] Join the Community

July 2, 2026 · 20 min

China’s NEW Meituan LongCat 2.0 Tested!

LongCat 2.0 (Open Source) Tested: Benchmarks, Games, and GLM 5.2 Comparison

The episode covers the official release of LongCat 2.0, an open-source Chinese agentic model revealed as the model behind the AoAlpha free API, with features like Sparse Attention, Zero Compute Experts, and MIPD. The host reviews benchmark claims (including Terminal Bench 2.1 and SWE-Bench Pro comparisons versus GPT-5.5 and Opus 4.8) and shares hands-on tests building game demos such as Dragon Realm, a Skyrim-style open world, and VoxelCraft, noting mixed results and frequent bugs. Access issues are mentioned, including difficulty using the API without a Chinese setup, so the model is tested via the website chat. A key point is that LongCat was trained on China’s Meituan chips without NVIDIA. Overall, GLM 5.2 is judged stronger in side-by-side game benchmarks, and the host promotes the AI Profit Boardroom and Agent OS setup.

00:00 [https://www.youtube.com/watch?v=60es_aKUcBU] LongCat 2.0 Launch
00:36 [https://www.youtube.com/watch?v=60es_aKUcBU&t=36s] Benchmarks and API Hurdles
01:38 [https://www.youtube.com/watch?v=60es_aKUcBU&t=98s] Game Demos Dragon Realm
02:23 [https://www.youtube.com/watch?v=60es_aKUcBU&t=143s] Goldy Bench Verdict
02:43 [https://www.youtube.com/watch?v=60es_aKUcBU&t=163s] Trained Without NVIDIA
03:32 [https://www.youtube.com/watch?v=60es_aKUcBU&t=212s] How to Use It
03:51 [https://www.youtube.com/watch?v=60es_aKUcBU&t=231s] Eval Results vs GPT
04:17 [https://www.youtube.com/watch?v=60es_aKUcBU&t=257s] GLM 5.2 Showdown
06:13 [https://www.youtube.com/watch?v=60es_aKUcBU&t=373s] Final Take and Recommendation
06:35 [https://www.youtube.com/watch?v=60es_aKUcBU&t=395s] Agent OS and Boardroom Plug
07:37 [https://www.youtube.com/watch?v=60es_aKUcBU&t=457s] Wrap Up

July 2, 2026 · 7 min

New NotebookLM Video Update is INSANE!

NotebookLM Just Added 60-Second Vertical AI Video Overviews (Coming Free Soon)

The script covers NotebookLM’s new feature for generating 60-second vertical short video overviews, now rolling out to Google AI Ultra and Pro subscribers and expected to reach free users soon. The creator demonstrates examples and explains that each video is generated from a specific NotebookLM notebook’s research, producing AI images, voiceover, and editing in a hands-off workflow, especially when connected via MCP to an agent operating system. They compare the short-video outputs with longer NotebookLM videos (more slideshow-like) and with alternatives like Open Montage (more cinematic) and a separate Video Agent (preferred for educational videos). Despite video quality being below human-made content, they highlight NotebookLM’s strength as a research-and-learning tool and its one-click outputs (audio, videos, slide decks, mind maps, infographics, flashcards, quizzes, tables, reports). The episode ends by promoting the AI Profit Boardroom for setup guides, trainings, and coaching.

00:00 [https://www.youtube.com/watch?v=j766Vhvv8Lo] NotebookLM Shorts Update
00:46 [https://www.youtube.com/watch?v=j766Vhvv8Lo&t=46s] What The Shorts Look Like
01:24 [https://www.youtube.com/watch?v=j766Vhvv8Lo&t=84s] Inside Agent OS Integration
02:42 [https://www.youtube.com/watch?v=j766Vhvv8Lo&t=162s] Quality Check And Tradeoffs
03:00 [https://www.youtube.com/watch?v=j766Vhvv8Lo&t=180s] OpenMontage Comparison
04:24 [https://www.youtube.com/watch?v=j766Vhvv8Lo&t=264s] Video Agent Alternative
04:54 [https://www.youtube.com/watch?v=j766Vhvv8Lo&t=294s] NotebookLM One Click Content Suite
05:56 [https://www.youtube.com/watch?v=j766Vhvv8Lo&t=356s] Shorts Vs Long Form Videos
06:45 [https://www.youtube.com/watch?v=j766Vhvv8Lo&t=405s] Learning And Speed Benefits
08:09 [https://www.youtube.com/watch?v=j766Vhvv8Lo&t=489s] Which Tool To Choose
08:49 [https://www.youtube.com/watch?v=j766Vhvv8Lo&t=529s] Join AI Profit Boardroom
09:27 [https://www.youtube.com/watch?v=j766Vhvv8Lo&t=567s] Community Training And Wrap Up

July 2, 2026 · 10 min

Claude Sonnet 5 VS GLM 5.2: Who Wins?

Claude Sonnet 5 vs GLM 5.2: One-Shot Coding Showdown (Games, Benchmarks, Cost & Agent OS)

The video compares Claude Sonnet 5 and GLM 5.2 side by side using one-shot builds (dungeon crawler, raycaster maze, multiple games, a website UI, and a Web OS). Sonnet 5 sometimes looks smoother but often feels basic or lacks gameplay, while GLM 5.2 is frequently more interesting and polished, though it can be buggy in some tests where Sonnet 5 wins. The video also reviews benchmarks (CursorBench and GaudiBench/Goldy Bench), stating Sonnet 5 scores higher than GLM 5.2 on CursorBench but Opus 4.8 outperforms Sonnet 5, and “Fable 5” leads overall and is expected to return within 24 hours. The creator highlights pricing differences (Sonnet 5 far more expensive than GLM 5.2), GLM 5.2’s open-source and OAuth/agentic compatibility, demonstrates plugging GLM into Claude Code via Agent OS, and promotes the Agent OS and AI Profit Boardroom community with tutorials, coaching calls, and support.

00:00 [https://www.youtube.com/watch?v=uHutsCe2HFA] One Shot Showdown
00:18 [https://www.youtube.com/watch?v=uHutsCe2HFA&t=18s] Dungeon Crawler Test
01:07 [https://www.youtube.com/watch?v=uHutsCe2HFA&t=67s] Raycaster Maze Faceoff
01:52 [https://www.youtube.com/watch?v=uHutsCe2HFA&t=112s] CursorBench Rankings
02:28 [https://www.youtube.com/watch?v=uHutsCe2HFA&t=148s] Opus vs Sonnet vs Fable
03:05 [https://www.youtube.com/watch?v=uHutsCe2HFA&t=185s] Pricing and Agent OS Setup
04:05 [https://www.youtube.com/watch?v=uHutsCe2HFA&t=245s] GaudiBench and Context Window
04:53 [https://www.youtube.com/watch?v=uHutsCe2HFA&t=293s] More Builds Mixed Results
07:11 [https://www.youtube.com/watch?v=uHutsCe2HFA&t=431s] Games and Visual Quality
07:59 [https://www.youtube.com/watch?v=uHutsCe2HFA&t=479s] Web UI and Web OS
09:08 [https://www.youtube.com/watch?v=uHutsCe2HFA&t=548s] Final Verdict and Leaderboards
10:19 [https://www.youtube.com/watch?v=uHutsCe2HFA&t=619s] Dont Chase Models
10:34 [https://www.youtube.com/watch?v=uHutsCe2HFA&t=634s] Agent OS Offer and Wrap Up

Yesterday · 11 min

Fable 5 is BACK!

Fable 5 Is Coming Back Tomorrow: Export Controls Lifted, Global Access Returns (Plus New Safeguards)

Anthropic announced that Claude Fable 5 will be redeployed globally starting July 1 after U.S. export controls imposed June 12 shut down access to Fable 5 and Mythos 5 for everyone due to immediate compliance needs. The controls were lifted June 30, with Fable 5 returning across Claude Platform, Claude AI, Claude Code, and Claude Cowork for Pro, Max, Team, and select Enterprise plans, included for up to 50% of weekly usage limits before shifting to paid usage credits after July 7. Mythos 5 access is being restored only to certain U.S. organizations via Project Glasswing following June 26 approval. The shutdown followed a report that Amazon researchers found a method to bypass Fable 5 safeguards; Anthropic says the bypass is now blocked in over 99% of cases and that U.S. Commerce testing agreed the new safeguards are extraordinarily strong, alongside plans for deeper government collaboration and pre-release evaluations.

00:00 [https://www.youtube.com/watch?v=Ao0oaaD4dkc] Fable 5 Returns
00:45 [https://www.youtube.com/watch?v=Ao0oaaD4dkc&t=45s] Global Rollout Details
01:13 [https://www.youtube.com/watch?v=Ao0oaaD4dkc&t=73s] Usage Limits and Credits
01:54 [https://www.youtube.com/watch?v=Ao0oaaD4dkc&t=114s] Mythos 5 Partial Restore
02:32 [https://www.youtube.com/watch?v=Ao0oaaD4dkc&t=152s] Why It Was Shut Down
03:46 [https://www.youtube.com/watch?v=Ao0oaaD4dkc&t=226s] Timeline and Big Picture
06:19 [https://www.youtube.com/watch?v=Ao0oaaD4dkc&t=379s] Amazon Bypass Explained
08:06 [https://www.youtube.com/watch?v=Ao0oaaD4dkc&t=486s] New Safeguards Breakdown
09:11 [https://www.youtube.com/watch?v=Ao0oaaD4dkc&t=551s] Industry and Government Frameworks
10:05 [https://www.youtube.com/watch?v=Ao0oaaD4dkc&t=605s] Wrap Up and Community Plug

Yesterday · 11 min
