AgentStack Daily

Episode 64: Claude Code 2.1.165, Microsoft's MAI Coding Model Family, and the Agent Infrastructure Project Radar

38 min · Yesterday
Description

Claude Code 2.1.165 is the current npm `latest` release as of June 5, following 2.1.163 and 2.1.164. All three are bug-fix and reliability releases that clean up background sessions, plugin hooks, skill syntax, and Windows path handling. Microsoft announced a seven-model MAI family at Build 2026 on June 2, with MAI-Code-1-Flash as the headline: a 5B-parameter coding model trained on GitHub Copilot production harnesses that scores 51% on SWE-Bench Pro and uses 60% fewer tokens than comparable models. The episode also covers the GitHub Project Radar: agent memory, code graphs, and MCP tooling that serve the local coding-agent stack. Show notes: https://tobyonfitnesstech.com/podcasts/episode-64/

All episodes

66 episodes

Episode 64: Claude Code 2.1.165, Microsoft's MAI Coding Model Family, and the Agent Infrastructure Project Radar

Yesterday · 38 min
Episode 63: OpenClaw 2026.6.1, Claude Code 2.1.162, Qwen 3.7 Max/Plus, and Agent Memory Infrastructure

[00:00] Episode hook: OpenClaw v2026.6.1, Hermes Agent v2026.5.29.2, and Claude Code 2.1.162 share the same episode window. The stable OpenClaw tag is v2026.6.1, the Hermes stable tag stays at v2026.5.29.2, and Claude Code's npm `latest` is 2.1.162. OpenClaw v2026.6.1 ships Workboard orchestration, a governed Skill Workshop, SQLite-backed state recovery, and MiniMax M3 provider support. Claude Code 2.1.162 adds waiting-for visibility in `claude agents --json` and a batch of permission and interrupt fixes across the five releases from 2.1.158 to 2.1.162. Qwen 3.7 Max and Plus split the coding-reasoning and multimodal-vision lanes. agentmemory gives every agent on your machine a shared persistent context layer. Show notes: https://tobyonfitnesstech.com/podcasts/episode-63/

June 4, 2026 · 31 min
Episode 62: Codex 0.136, Stanford's Agent Guidelines, AWS OpenAI, and GPU Efficiency

AgentStack Daily EP062 leads with Codex `rust-v0.136.0`: better TUI diagnostics and error context, improved app-server lifecycle handling, named hooks and permission scopes, Python SDK and Node SDK improvements, and non-interactive installation support. Stanford's CS336 course publishes a formal AI agent guidelines document that reaches 1,863 stars in under 24 hours — institutional validation that agent workflow guidelines are becoming a first-class engineering concern. OpenAI puts GPT-4.5, o3, and Codex on AWS Bedrock, completing the pattern where both major labs distribute through the same cloud. Expanse from YC P26 uses cluster-specific fine-tuned models to predict GPU job resource needs and outperforms frontier LLMs by 8x on that task, backed by real HPC telemetry and SLURM/Kubernetes integration. The project radar covers agent OS for hardware, terminal context managers, MCP workflow templates, and physical agent scheduling. Show notes: https://tobyonfitnesstech.com/podcasts/episode-62/

June 3, 2026 · 46 min
Episode 61: OpenClaw 2026.5.28, MiniMax M3, Claude Code 2.1.159, and Code Graphs

OpenClaw `v2026.5.28` leads with steadier agent and Codex runtime recovery, safer channel delivery, stricter browser and automation inputs, provider and media expansion, externalized plugin surfaces, and bounded release proof. MiniMax M3 follows as the model-discovery headline: an open-weight-positioned coding and agent model with MSA sparse attention, up to 1M context, native multimodality, API availability, MiniMax Code integration, and open weights promised after the launch window. Claude Code `2.1.159` is the quiet CLI hygiene lane. Then EP061 evaluates Understand Anything, agentgateway, MCPJungle, and CodeAlmanac as practical tools for codebase graphs, agent traffic control, MCP management, and persistent repo context. Show notes: https://tobyonfitnesstech.com/podcasts/episode-61/

June 2, 2026 · 39 min
Episode 60: Claude Code 2.1.158, Codex Windows Control, Runtime Instructions, and Local Agent Memory

Claude Code `2.1.158` on npm `latest` adds auto mode for Bedrock, Vertex, and Foundry when `CLAUDE_CODE_ENABLE_AUTO_MODE=1` is set. OpenAI's May 29 Codex update adds Windows computer use in the Codex app, remote control from mobile or Mac while the Windows host keeps project files and local context, in-app browser infrastructure improvements, and Codex Profiles for identity, activity, usage stats, and token activity. The episode then covers Anthropic's Messages API system entries as a runtime-control surface, plus OpenLore, Mnemo, OpenMonoAgent, and Prometheus as project-radar examples of local architectural memory, persistent agent cognition, local-only coding agents, and graph-backed repair loops. Show notes: https://tobyonfitnesstech.com/podcasts/episode-60/

May 31, 2026 · 46 min