Artificial Discourse

Large Language Models Can Self-Improve in Long-context Reasoning

11 min · 22 Nov 2024

Description

This research paper investigates whether large language models (LLMs) can self-improve in long-context reasoning, that is, in processing and understanding complex information spread across long stretches of text. The authors propose an approach called SEALONG, which samples multiple outputs from the LLM for a given question and scores them using Minimum Bayes Risk (MBR). MBR scoring prioritizes outputs that agree more closely with the other sampled outputs, thereby filtering out outputs that are likely to be incorrect or hallucinated. SEALONG then uses the high-scoring outputs for further training, through either supervised fine-tuning or preference optimization. The authors report, based on extensive experiments, that SEALONG significantly improves the long-context reasoning performance of LLMs without requiring annotations from expert models or human labeling.
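The scoring step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the token-overlap similarity, the function names (`jaccard`, `mbr_scores`, `pick_chosen_and_rejected`), and the choice of the lowest-scoring output as the "rejected" example for preference optimization are all assumptions made for the example.

```python
from typing import Callable, List, Tuple


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity. A stand-in (assumption) for whatever
    similarity measure a real system would use to compare outputs."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def mbr_scores(outputs: List[str],
               sim: Callable[[str, str], float] = jaccard) -> List[float]:
    """Score each sampled output by its average similarity to the others.
    Outputs that agree with the rest of the samples score higher."""
    n = len(outputs)
    if n < 2:
        return [0.0] * n
    return [
        sum(sim(outputs[i], outputs[j]) for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]


def pick_chosen_and_rejected(outputs: List[str]) -> Tuple[str, str]:
    """Return (highest-scoring, lowest-scoring) outputs.
    The first could serve as a fine-tuning target; using the pair for
    preference optimization is an assumption of this sketch."""
    scores = mbr_scores(outputs)
    ranked = sorted(range(len(outputs)), key=scores.__getitem__, reverse=True)
    return outputs[ranked[0]], outputs[ranked[-1]]


# Hypothetical samples for one question
samples = [
    "the capital is paris",
    "paris is the capital",
    "the capital is paris",
    "the capital is rome",
]
scores = mbr_scores(samples)
chosen, rejected = pick_chosen_and_rejected(samples)
```

Here the three mutually consistent samples score higher than the outlier, so the outlier ends up as the rejected output.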


All episodes

41 episodes


Stronger Models are NOT Stronger Teachers for Instruction Tuning

This research paper investigates how the choice of large language model (LLM) used as a "teacher" to generate synthetic responses affects instruction tuning. The authors demonstrate a surprising phenomenon they call the "Larger Models' Paradox": larger and supposedly "stronger" teacher models do not always lead to better instruction-following abilities in the smaller base models being fine-tuned. They propose a metric called Compatibility-Adjusted Reward (CAR) to better predict the effectiveness of a teacher model by taking into account its compatibility with the base model. The study challenges the common assumption that larger LLMs are always better teachers and suggests that successful instruction tuning requires a more nuanced understanding of compatibility.

25 Nov 2024 · 13 min

Large Language Models Can Self-Improve in Long-context Reasoning


22 Nov 2024 · 11 min

LLaMA-Mesh: Unifying 3D Mesh Generation with Language Models

This paper, LLaMA-Mesh: Unifying 3D Mesh Generation with Language Models, introduces a new method for generating 3D models using large language models (LLMs). The authors address the challenge of tokenizing 3D mesh data for LLMs by representing the mesh as plain text in the OBJ file format, a standard text-based format for 3D models. This allows direct integration with LLMs without modifying the vocabulary or tokenizer, minimizing additional training overhead. The study then introduces LLaMA-Mesh, a fine-tuned LLaMA model that can generate 3D meshes from text prompts, produce interleaved text and 3D mesh outputs, and understand and interpret 3D meshes. LLaMA-Mesh achieves mesh generation quality comparable to that of models trained from scratch while maintaining strong text generation abilities, demonstrating the potential for LLMs to become universal generative tools across multiple modalities.

21 Nov 2024 · 18 min
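To make the "mesh as plain text" idea from the LLaMA-Mesh description concrete, here is a minimal sketch of writing a mesh in OBJ form, where `v` lines list vertex coordinates and `f` lines list faces by 1-based vertex index. The function name `mesh_to_obj` and the example triangle are hypothetical, and this is not the paper's preprocessing code.

```python
from typing import List, Sequence


def mesh_to_obj(vertices: Sequence[Sequence[int]],
                faces: Sequence[Sequence[int]]) -> str:
    """Serialize a mesh as OBJ text.
    `faces` holds 0-based vertex indices; OBJ uses 1-based indices."""
    lines: List[str] = [f"v {x} {y} {z}" for x, y, z in vertices]
    lines += ["f " + " ".join(str(i + 1) for i in face) for face in faces]
    return "\n".join(lines)


# A single triangle (hypothetical example data)
obj_text = mesh_to_obj(
    vertices=[(0, 0, 0), (1, 0, 0), (0, 1, 0)],
    faces=[(0, 1, 2)],
)
print(obj_text)
```

Text like this can be passed to a standard tokenizer as an ordinary string, which is the property the description highlights.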