EP284: Compressing massive context into soft tokens

22 min · Yesterday

Description

Title: End-to-End Context Compression at Scale
Source: http://arxiv.org/abs/2606.09659v1
Summary: This paper introduces Latent Context Language Models (LCLMs), a novel architectural primitive that utilizes encoder-decoder compression to efficiently handle long-context sequences at scale. It establishes a new Pareto frontier for accuracy and efficiency, providing a foundational backbone for next-generation agents that require massive context windows.

All episodes

285 episodes

EP284: Compressing massive context into soft tokens

Yesterday · 22 min

EP283: Aligning AI planners with tool capabilities

Title: Capability-Aligned Hierarchical Learning for Tool-Augmented LLMs
Source: http://arxiv.org/abs/2606.09371v1
Summary: This paper proposes Capability-Aligned Hierarchical Learning (CAHL), a novel framework that jointly optimizes high-level planning and low-level execution policies using reinforcement learning. It addresses the fundamental bottleneck of planner-executor misalignment, creating a more robust and foundational reasoning loop for tool-augmented agentic systems.

Yesterday · 21 min

EP282: AI gladiators training in shopping arenas

Title: Bittensor Agent Arenas as a Trajectory Primitive: Distilling a Shopping Agent from ShoppingBench Subnet Traces
Source: http://arxiv.org/abs/2606.10064v1
Summary: This paper introduces the concept of Agent Arenas as a "trajectory primitive," establishing a novel framework for generating diverse, incentive-aligned training data for agentic post-training. This approach represents a significant breakthrough in scaling agent capabilities by moving beyond the limitations of synthetic data and unjudged production logs.

2 July 2026 · 22 min

EP281: Restoring plasticity to over-trained AI

Title: When RL Fails after SFT: Rejuvenating Model Plasticity for Robust SFT-to-RL Handoff
Source: http://arxiv.org/abs/2606.09932v1
Summary: This paper identifies and solves the critical 'loss of plasticity' bottleneck in the standard LLM post-training pipeline where excessive SFT inhibits subsequent RL optimization. It introduces 'Rejuvenation', a foundational training primitive that uses model fusion and neuron resets to enable robust reasoning gains during RL while preserving SFT-acquired knowledge.

2 July 2026 · 22 min

EP280: Trajectory Refined Distillation Fixes AI Reasoning

Title: Trajectory-Refined Distillation
Source: http://arxiv.org/abs/2606.08432v1
Summary: This paper identifies and mitigates 'prefix failure' in on-policy distillation, a structural issue that hampers the efficiency of reasoning-scale post-training. By introducing trajectory-level corrections, it provides a foundational efficiency breakthrough that improves exploration and reasoning accuracy for large language models.

1 July 2026 · 23 min
