NEW GLM 5.2 BEATS Claude?

9 min · Jun 16, 2026

Description

GLM 5.2 vs Qwen 3.7 Max vs Claude Opus 4.8: Real-World Tests vs Benchmarks (No Second Chances)

The episode compares GLM 5.2 (ZAI), Qwen 3.7 Max (Alibaba), and Claude Opus 4.8 (Anthropic) head-to-head on five one-shot tasks, arguing that benchmark rankings didn’t match real usability. In coding-focused tests like a voxel runner game, a liquid-in-a-bowl animation, a business landing page, and an arcade game, GLM 5.2 produced the most fun, polished, and feature-rich results, while Claude’s outputs were often basic and Qwen’s were sometimes buggy or incomplete; Claude clearly won the solar-system orbit map task. The script also notes Qwen’s strong reported benchmarks and faster replies, GLM’s slower responses in agents but strong CLI coding, and the limitations of integrating Claude into agent workflows compared to Qwen and GLM in Hermes and the creator’s agent operating system.

00:00 [https://www.youtube.com/watch?v=MnFwz8O3F-U] Head To Head Setup
01:27 [https://www.youtube.com/watch?v=MnFwz8O3F-U&t=87s] Coding Tests Results
04:09 [https://www.youtube.com/watch?v=MnFwz8O3F-U&t=249s] Arcade Game Showdown
04:50 [https://www.youtube.com/watch?v=MnFwz8O3F-U&t=290s] Benchmarks Versus Reality
06:01 [https://www.youtube.com/watch?v=MnFwz8O3F-U&t=361s] Agents Workflow Tradeoffs
07:59 [https://www.youtube.com/watch?v=MnFwz8O3F-U&t=479s] Final Recommendations

Sign up now and join the AI News Today | Julian Goldie Podcast community!

All episodes

496 episodes

Sakana: BETTER Than Fable 5?

Sakana Fugu Ultra vs Fusion vs Claude Opus 4.8 (42 Builds + Goldie Bench): Is It Fable 5-Level?

The video reviews Sakana Fugu Ultra several days after its release, testing whether it matches “Fable 5-level intelligence” claims by running 42 builds using the same prompts across Fugu Ultra, Fusion (OpenRouter), and Claude Opus 4.8, plus Goldie Bench leaderboard comparisons. The script explains Fugu as a multi-agent orchestration system that fuses outputs from closed and open models, and contrasts Fusion’s pay-per-token API pricing with Fugu’s flat subscription, which comes with token limits, frequent lockouts, slow responses, and one-shot workflows. Side-by-side build demos (solar system, runner, fireworks, dungeon crawler, open world) show mixed results but generally stronger outputs from Fugu Ultra, with Fusion often close or better on some examples and Opus 4.8 frequently failing. Benchmarks cited include Fugu leading LiveCode Bench but trailing Fable 5 on SWE Bench Pro and Terminal Bench, and a brief comparison of Fugu Ultra vs cheaper Fugu Mini. The episode ends by promoting the AI Profit Boardroom agent operating system integrating Sakana, Fusion, Claude, and other tools.

00:00 [https://www.youtube.com/watch?v=uhcWUS4f0vc] Fugu Ultra Overview
00:45 [https://www.youtube.com/watch?v=uhcWUS4f0vc&t=45s] Testing Setup and Limits
01:26 [https://www.youtube.com/watch?v=uhcWUS4f0vc&t=86s] Models and Method
02:02 [https://www.youtube.com/watch?v=uhcWUS4f0vc&t=122s] What Is Fugu
03:00 [https://www.youtube.com/watch?v=uhcWUS4f0vc&t=180s] Leaderboard and Pricing
04:12 [https://www.youtube.com/watch?v=uhcWUS4f0vc&t=252s] Build Demos Part 1
06:21 [https://www.youtube.com/watch?v=uhcWUS4f0vc&t=381s] Build Demos Part 2
08:33 [https://www.youtube.com/watch?v=uhcWUS4f0vc&t=513s] Open World Comparison
09:23 [https://www.youtube.com/watch?v=uhcWUS4f0vc&t=563s] Benchmark Breakdown
10:16 [https://www.youtube.com/watch?v=uhcWUS4f0vc&t=616s] Fugu Mini vs Ultra
12:03 [https://www.youtube.com/watch?v=uhcWUS4f0vc&t=723s] Final Verdict
12:13 [https://www.youtube.com/watch?v=uhcWUS4f0vc&t=733s] Boardroom and Outro

Yesterday · 14 min

New Fable 5 LEAKS: Coming Back Soon?

Fable 5 Returning Soon? New Claude Code, Bedrock Catalog & AWS Doc Leaks Explained

The episode reviews recent leaks suggesting Anthropic’s Claude Fable 5 may return after being released on June 9 and pulled three days later, with new signals appearing June 24–25. It cites three “checkable trails” moving in the same timeframe: Claude Code v2.1.190 string changes (removing “purchased separately” and adding a weekly-usage message that implies subscription inclusion), Fable 5 reappearing as a live listing in the Amazon Bedrock model catalog, and AWS Bedrock docs/model cards showing the model lifecycle as active with Bedrock IDs for both “Anthropic Claude Fable 5” and “global” versions. The host notes nothing is confirmed, raises the question of US-only availability, and estimates a possible return by early July/within seven days. The video ends by emphasizing building flexible systems (Agent OS) that can swap models quickly and promoting the AI Profit Boardroom where updates, guides, coaching calls, and community support are provided.

00:00 [https://www.youtube.com/watch?v=zqEOFTdXqdI] Fable 5 Return Rumors
00:44 [https://www.youtube.com/watch?v=zqEOFTdXqdI&t=44s] What We Know So Far
01:24 [https://www.youtube.com/watch?v=zqEOFTdXqdI&t=84s] Timeline of Events
02:38 [https://www.youtube.com/watch?v=zqEOFTdXqdI&t=158s] Claude Code String Clues
03:21 [https://www.youtube.com/watch?v=zqEOFTdXqdI&t=201s] Subscription Plan Implications
03:56 [https://www.youtube.com/watch?v=zqEOFTdXqdI&t=236s] Amazon Bedrock Listing
05:13 [https://www.youtube.com/watch?v=zqEOFTdXqdI&t=313s] AWS Docs Model Cards
05:44 [https://www.youtube.com/watch?v=zqEOFTdXqdI&t=344s] Regions and Release Window
06:25 [https://www.youtube.com/watch?v=zqEOFTdXqdI&t=385s] Build Systems Not Models
07:23 [https://www.youtube.com/watch?v=zqEOFTdXqdI&t=443s] Agent OS and Boardroom Pitch
07:57 [https://www.youtube.com/watch?v=zqEOFTdXqdI&t=477s] Wrap Up and Link
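
The "build systems, not models" point from this episode can be illustrated with a small sketch. It is not the creator's Agent OS: `ModelRouter`, the backend names `model-a` and `model-b`, and the stand-in lambda backends are all hypothetical, and a real backend would call a provider API instead of echoing the prompt.

```python
from typing import Callable, Dict, Optional

# A backend is any function from prompt to reply (hypothetical stand-in
# for a real provider client).
Backend = Callable[[str], str]


class ModelRouter:
    """Keeps named backends and lets the caller switch the active one."""

    def __init__(self) -> None:
        self._backends: Dict[str, Backend] = {}
        self.active: Optional[str] = None

    def register(self, name: str, backend: Backend) -> None:
        # The first registered backend becomes the default.
        self._backends[name] = backend
        if self.active is None:
            self.active = name

    def use(self, name: str) -> None:
        # Swap the active model without touching calling code.
        if name not in self._backends:
            raise KeyError(f"unknown model: {name}")
        self.active = name

    def ask(self, prompt: str) -> str:
        if self.active is None:
            raise RuntimeError("no model registered")
        return self._backends[self.active](prompt)


router = ModelRouter()
router.register("model-a", lambda p: f"[model-a] {p}")
router.register("model-b", lambda p: f"[model-b] {p}")

first = router.ask("hello")   # answered by model-a
router.use("model-b")         # one-line swap
second = router.ask("hello")  # answered by model-b
```

Because callers only ever talk to `router.ask`, a model that is pulled or re-released changes one `use` call rather than every workflow.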

Yesterday · 8 min

Hermes OS is INSANE! 🤯

Inside My Hermes Agent OS: Oracle, Jarvis Voice Control, Outreach SaaS, Memory Galaxy & One-Click Content Automation

The script demos a custom “Agent OS” built around Hermes Agent, showing how it segments multiple agents and workflows into one system. It highlights Hermes Oracle for pulling trending news and generating social posts or SEO-optimized WordPress articles in one click with scheduled daily refreshes; Hermes Jarvis, a voice-activated mode that can run real-time actions like opening websites and providing daily briefings; and a new Outreach Agent that finds leads, enriches/validates emails, manages campaigns, inbox/sent items, dashboards, and suggested outreach sequences using API keys and sending caps. It also covers goal mode with autonomous looping and QC, a connected Memory Galaxy for logged/interlinked knowledge, a Kanban board for multi-profile agent orchestration, NotebookLM asset syncing via MCP, video/SEO engines, Paperclip for team-based agent orgs, and an idea-to-implementation pipeline shared via the AI Profit Boardroom with tutorials, updates, community support, coaching calls, and testimonials.

00:00 [https://www.youtube.com/watch?v=R6cuigDZKCM] Agent OS Overview
00:18 [https://www.youtube.com/watch?v=R6cuigDZKCM&t=18s] Hermes Oracle News
02:02 [https://www.youtube.com/watch?v=R6cuigDZKCM&t=122s] Jarvis Voice Control
03:06 [https://www.youtube.com/watch?v=R6cuigDZKCM&t=186s] Outreach Email Engine
05:08 [https://www.youtube.com/watch?v=R6cuigDZKCM&t=308s] Goal Mode Autonomy
05:41 [https://www.youtube.com/watch?v=R6cuigDZKCM&t=341s] Memory Galaxy Brain
06:23 [https://www.youtube.com/watch?v=R6cuigDZKCM&t=383s] Kanban Team Workflow
07:06 [https://www.youtube.com/watch?v=R6cuigDZKCM&t=426s] NotebookLM Asset Sync
07:37 [https://www.youtube.com/watch?v=R6cuigDZKCM&t=457s] Video SEO Loop Tools
08:35 [https://www.youtube.com/watch?v=R6cuigDZKCM&t=515s] MCP Workspace Studio
08:59 [https://www.youtube.com/watch?v=R6cuigDZKCM&t=539s] Paperclip Agent Orgs
09:22 [https://www.youtube.com/watch?v=R6cuigDZKCM&t=562s] Idea To Shipping
09:43 [https://www.youtube.com/watch?v=R6cuigDZKCM&t=583s] Get The Setup
10:23 [https://www.youtube.com/watch?v=R6cuigDZKCM&t=623s] Boardroom Tutorials Support
11:17 [https://www.youtube.com/watch?v=R6cuigDZKCM&t=677s] Build Vs Customize
11:41 [https://www.youtube.com/watch?v=R6cuigDZKCM&t=701s] Testimonials And Wrap

Yesterday · 12 min

Hermes OS is INSANE! 🤯

Hermes Lead Machine: 1-Click Lead Gen + Enriched Outreach Pipeline (Find, Verify, Score, Write, Send)

Julian demonstrates the experimental “Hermes Lead Machine,” a Hermes Agent workflow that generates and enriches leads in one click using the Hunter API. Users can either paste an existing list to enrich or describe the exact prospects they want in plain English (e.g., SEO agencies interested in link-building) and the system finds companies and people, captures domains and emails, enriches company details, verifies deliverability to reduce bounces, scores and filters leads (0–100) for fit, and segments them by status (new, enriched, valid, contacted, replied). It then writes personalized openers and outreach emails and can send through a dedicated inbox, all managed in a dashboard that feels like a SaaS tool. He contrasts this with manual, duct-taped outreach and explains setup via Agent OS/AppHub, API key, and email configuration, plus community support and coaching calls.

00:00 [https://www.youtube.com/watch?v=y9nJ7yt10_k] Hermes Lead Machine Intro
00:13 [https://www.youtube.com/watch?v=y9nJ7yt10_k&t=13s] Lead Finder Dashboard Tour
00:38 [https://www.youtube.com/watch?v=y9nJ7yt10_k&t=38s] Hunter API Find and Enrich Demo
01:30 [https://www.youtube.com/watch?v=y9nJ7yt10_k&t=90s] Segmentation and Workflow Benefits
02:35 [https://www.youtube.com/watch?v=y9nJ7yt10_k&t=155s] Find Verify Write Pipeline
03:05 [https://www.youtube.com/watch?v=y9nJ7yt10_k&t=185s] Old Outreach vs New Automation
04:26 [https://www.youtube.com/watch?v=y9nJ7yt10_k&t=266s] Why Anyone Can Use It
05:07 [https://www.youtube.com/watch?v=y9nJ7yt10_k&t=307s] Six Parts Explained
07:21 [https://www.youtube.com/watch?v=y9nJ7yt10_k&t=441s] Setup and Common Questions
08:12 [https://www.youtube.com/watch?v=y9nJ7yt10_k&t=492s] Agent OS Offer and Community
09:25 [https://www.youtube.com/watch?v=y9nJ7yt10_k&t=565s] Wrap Up and Next Steps
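
The score, filter, and segment steps described in this episode could look roughly like the sketch below. Only the 0-100 score range and the five status labels come from the description; the `Lead` class, the keyword-matching score, the `min_score` threshold, and the sample data are assumptions for illustration, and nothing here calls the Hunter API or reflects how the Hermes Lead Machine is actually implemented.

```python
from dataclasses import dataclass

# Status labels as listed in the episode description.
STATUSES = ("new", "enriched", "valid", "contacted", "replied")


@dataclass
class Lead:
    email: str
    company: str
    status: str = "new"
    score: int = 0  # fit score on a 0-100 scale


def score_lead(lead: Lead, keywords: list[str]) -> int:
    """Hypothetical fit score: share of keywords found in the company
    name, scaled to 0-100."""
    if not keywords:
        return 0
    name = lead.company.lower()
    hits = sum(1 for k in keywords if k.lower() in name)
    return round(100 * hits / len(keywords))


def filter_and_segment(leads, keywords, min_score=50):
    """Score every lead, drop those below min_score, and group the
    remaining leads by status."""
    segments = {s: [] for s in STATUSES}
    for lead in leads:
        lead.score = score_lead(lead, keywords)
        if lead.score >= min_score:
            segments[lead.status].append(lead)
    return segments


# Made-up sample leads.
leads = [
    Lead("a@example.com", "Acme SEO Agency", status="valid"),
    Lead("b@example.com", "Bolt Plumbing"),
]
segments = filter_and_segment(leads, ["seo", "agency"])
```

With these sample leads, only the first one matches the keywords, so it lands in the `valid` segment and the second is filtered out.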

Yesterday · 9 min

Hermes Agent: How to Automate Lead Generation!

Hermes Agent Sidekick Pets Update: Setup, Commands, and 3,000+ Pixel Companions

Hermes Agent has introduced the Sidekick system, adding an animated pixel pet that reflects an agent’s status at a glance (idle, thinking, running tools, waiting, done, or failed) across the CLI, TUI, desktop app, and the Agent OS dashboard. The pet acts as a personality-driven status indicator to improve peripheral awareness versus spinners and log lines, helping users quickly notice completion or failures. Setup involves updating Hermes, browsing the Pet Dex gallery with `Hermes pets list`, installing a pet (e.g., `Hermes pets install Boba select`), adjusting size with `Hermes pet scale`, swapping anytime, or disabling instantly with `Hermes pets off`; users can choose from nearly 3,000 pets or submit their own, and the sprite cannot affect code or files. The episode also promotes the AI Profit Boardroom for the full Agent OS, additional agents (Oracle, Jarvis, outreach/lead gen), trainings, coaching calls, and support.

00:00 [https://www.youtube.com/watch?v=WwD3bFZEr9k] Meet the Sidekick Pet
00:45 [https://www.youtube.com/watch?v=WwD3bFZEr9k&t=45s] How Pet Poses Work
01:41 [https://www.youtube.com/watch?v=WwD3bFZEr9k&t=101s] Why It Matters
02:04 [https://www.youtube.com/watch?v=WwD3bFZEr9k&t=124s] Where It Shows Up
02:45 [https://www.youtube.com/watch?v=WwD3bFZEr9k&t=165s] Update and Install
03:29 [https://www.youtube.com/watch?v=WwD3bFZEr9k&t=209s] Customize or Disable
04:01 [https://www.youtube.com/watch?v=WwD3bFZEr9k&t=241s] Gimmick or Useful
05:12 [https://www.youtube.com/watch?v=WwD3bFZEr9k&t=312s] Quick Recap
05:43 [https://www.youtube.com/watch?v=WwD3bFZEr9k&t=343s] Get the Full Agent OS

Yesterday · 7 min
