EP252: How batch sizes sharpen AI reasoning

21 min · Yesterday

Description

Title: How Should LLMs Consume High-Quality Data? Optimal Data Scheduling via Quality-Aware Functional Scaling Laws
Source: http://arxiv.org/abs/2605.25698v1
Summary: This paper establishes foundational quality-aware functional scaling laws that provide the first theoretical closed-form solution for scheduling high-quality data during LLM training. The introduced 'Drop-Stable-Rampup' schedule optimizes training dynamics across noise-limited and signal-limited regimes, yielding significant breakthroughs in mathematical reasoning performance.

All episodes

253 episodes

EP253: MACA optimizes AI agent coordination

Title: Multi-Agent Coordination Adaptation via Structure-Guided Orchestration
Source: http://arxiv.org/abs/2605.25746v1
Summary: The paper introduces a novel probabilistic framework for multi-agent coordination by casting orchestration as posterior inference over task-specific structures. This foundational approach balances structural stability with dynamic adaptability, significantly improving execution efficiency and performance while reducing token overhead in complex agentic workflows.

Yesterday · 20 min

EP252: How batch sizes sharpen AI reasoning

Yesterday · 21 min

EP251: How SR2AM stops AI overthinking

Title: Efficient Agentic Reasoning Through Self-Regulated Simulative Planning
Source: http://arxiv.org/abs/2605.22138v1
Summary: This paper introduces a foundational three-system reasoning framework (comprising reactive, simulative, and self-regulated components) that enables agents to autonomously manage their planning depth and horizon. By treating the LLM as a world model for future-state prediction, it demonstrates that structured deliberation can allow smaller models to match the performance of systems orders of magnitude larger.

June 16, 2026 · 19 min

EP250: Compiling agent workflows into model weights

Title: Compiling Agentic Workflows into LLM Weights: Near-Frontier Quality at Two Orders of Magnitude Less Cost
Source: http://arxiv.org/abs/2605.22502v1
Summary: This work proposes the 'subterranean agent' paradigm, which replaces external orchestration frameworks by compiling agentic workflows directly into the model's weights via fine-tuning. This foundational shift addresses the cost and latency bottlenecks of frontier-model prompting while providing a more efficient and private alternative for procedural task execution.

June 16, 2026 · 19 min

EP249: Mem-pi fixes AI amnesia with generative memory

Title: Mem-π: Adaptive Memory through Learning When and What to Generate
Source: http://arxiv.org/abs/2605.21463v1
Summary: Mem-π presents a foundational shift in agent memory architectures by replacing static similarity-based retrieval with a dedicated generative model that produces context-specific guidance. This framework enables agents to dynamically adapt their memory usage, leading to substantial improvements in complex reasoning and long-horizon task execution.

June 15, 2026 · 22 min
