The Daily ML

Podcast by The Daily ML

English

Technology & science


About The Daily ML

A podcast covering one top machine learning paper every day.

All episodes

10 episodes

Ep49. Artificial Intelligence, Scientific Discovery, and Product Innovation

This research paper examines the impact of an artificial intelligence tool for materials discovery on the productivity and performance of scientists working in a large U.S. firm's R&D lab. The study exploits a randomized rollout of the AI tool across teams of scientists, allowing the researchers to draw causal inferences about the effects of the technology. The paper demonstrates that the AI tool significantly increases the rate of materials discovery, patent filings, and product innovation, but these benefits are unequally distributed among scientists. The researchers find that the AI tool is most beneficial to scientists with strong judgment skills, which involve the ability to evaluate and prioritize AI-generated candidate compounds. The study also reveals that the AI tool automates a significant portion of idea generation tasks, resulting in a reallocation of scientist labor towards judgment tasks. This reallocation, along with the increased demand for judgment skills, explains the heterogeneous impact of the AI tool on scientific performance.

18 Nov 2024 - 9 min

Ep48. Large Language Models Can Self-Improve in Long-context Reasoning

This research paper investigates how large language models (LLMs) can improve their ability to reason over long contexts. The authors propose a self-improvement method called SEALONG that involves sampling multiple reasoning outputs from an LLM, scoring these outputs using Minimum Bayes Risk (MBR), and then fine-tuning the model using the highest-scoring outputs or by contrasting high-scoring and low-scoring outputs for preference optimization. Extensive experiments on several leading LLMs demonstrate that SEALONG effectively improves the long-context reasoning capabilities of LLMs without relying on human annotations or advanced models. The paper further analyzes the impact of various prompting strategies, scoring methods, and training parameters on SEALONG's performance.

16 Nov 2024 - 11 min
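
The sample, score, and select loop described in the Ep48 summary can be sketched as follows. This is a minimal illustration, not the paper's implementation: the token-overlap `jaccard` similarity is a stand-in assumption for whatever similarity measure the authors use, and the function names `mbr_scores` and `select_for_training` are hypothetical. The idea behind Minimum Bayes Risk scoring is that an output agreeing with most other samples is more likely to be correct.

```python
from typing import Callable, List, Tuple


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; an assumed stand-in for a real similarity model."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def mbr_scores(outputs: List[str],
               similarity: Callable[[str, str], float] = jaccard) -> List[float]:
    """Score each sampled output by its mean similarity to all other samples."""
    n = len(outputs)
    if n < 2:
        return [0.0] * n
    scores = []
    for i, candidate in enumerate(outputs):
        total = sum(similarity(candidate, other)
                    for j, other in enumerate(outputs) if j != i)
        scores.append(total / (n - 1))
    return scores


def select_for_training(outputs: List[str]) -> Tuple[str, Tuple[str, str]]:
    """Return the top-scoring output (for supervised fine-tuning) and a
    (chosen, rejected) pair (for preference optimization)."""
    scores = mbr_scores(outputs)
    ranked = sorted(range(len(outputs)), key=lambda i: scores[i], reverse=True)
    best, worst = outputs[ranked[0]], outputs[ranked[-1]]
    return best, (best, worst)


samples = [
    "the answer is 42",
    "the answer is 42 indeed",
    "I think it is 7",
]
best, pair = select_for_training(samples)
print(best)
print(pair)
```

In this toy run the two samples that agree with each other score highest, and the outlier becomes the rejected side of the preference pair.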

Ep47. Personalization of Large Language Models: A Survey

This paper is a survey of personalized large language models (LLMs), outlining different ways to adapt these models for user-specific needs. It analyzes how to personalize LLMs based on various user-specific data such as static attributes, interaction history, and pair-wise human preferences. The authors propose taxonomies for personalization granularity (user-level, persona-level, and global preference), techniques (RAG, prompting, representation learning, and RLHF), evaluation metrics (intrinsic and extrinsic), and datasets (with and without ground-truth text). The paper concludes by highlighting key challenges for the future of personalized LLMs, including the cold-start problem, stereotype and bias issues, privacy concerns, and the complexities of multimodality.

16 Nov 2024 - 26 min

Ep46. Number Cookbook: Number Understanding of Language Models and How to Improve It

This research paper investigates the numerical understanding and processing abilities (NUPA) of large language models (LLMs). The authors introduce a benchmark, covering various numerical representations and tasks, to systematically evaluate LLMs' capabilities in handling numbers. The paper finds that while LLMs perform well on simpler tasks, their performance deteriorates significantly as task complexity and input length increase. The authors also explore various techniques to improve NUPA, including specialized tokenizers, positional encodings, and data formats. Despite some successes in improving NUPA during pre-training, these techniques are found to be ineffective when applied to already trained models. The paper concludes that further research is necessary to address the challenges of NUPA in LLMs and enable them to confidently handle numerical tasks in real-world applications.

14 Nov 2024 - 17 min

Ep45. Multi-expert Prompting Improves Reliability, Safety and Usefulness of Large Language Models

This paper describes a method called Multi-expert Prompting that aims to improve the reliability, safety, and usefulness of large language models (LLMs). The method simulates multiple experts with different areas of expertise and aggregates their responses to a query, ultimately selecting the best answer based on criteria like truthfulness, factuality, and informativeness. This process is inspired by the Nominal Group Technique, a human-designed decision-making framework. The authors report that Multi-expert Prompting outperforms existing prompting methods on various benchmarks, especially in scenarios where diverse perspectives are valuable. The paper also discusses ethical considerations related to the potential for bias amplification and explores ways to mitigate these risks.

12 Nov 2024 - 11 min
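
The flow described in the Ep45 summary (simulate experts, aggregate their answers, select the best one) might look roughly like the sketch below. Everything here is an assumption for illustration: `llm` is a hypothetical callable that maps a prompt string to a completion string, the expert list is passed in by the caller, and the prompt wording is invented rather than taken from the paper.

```python
from typing import Callable, List


def multi_expert_answer(question: str,
                        experts: List[str],
                        llm: Callable[[str], str]) -> str:
    """Answer a question by simulating several experts and picking the best reply.

    `llm` is a hypothetical prompt-to-text function supplied by the caller.
    """
    # Step 1: each simulated expert answers independently.
    answers = [
        llm(f"You are {expert}. Answer the question.\nQuestion: {question}")
        for expert in experts
    ]

    # Step 2: aggregate the expert answers into one combined answer.
    listing = "\n".join(f"({e}) {a}" for e, a in zip(experts, answers))
    combined = llm(
        "Merge the expert answers below into one answer, keeping points of "
        f"agreement and resolving conflicts.\nQuestion: {question}\n{listing}"
    )

    # Step 3: select the best candidate among expert answers and the merge.
    candidates = answers + [combined]
    menu = "\n".join(f"[{i}] {c}" for i, c in enumerate(candidates))
    choice = llm(
        "Pick the most truthful, factual and informative candidate. "
        f"Reply with its number only.\nQuestion: {question}\n{menu}"
    )
    try:
        index = int(choice.strip())
    except ValueError:
        index = len(candidates) - 1  # fall back to the combined answer
    if not 0 <= index < len(candidates):
        index = len(candidates) - 1
    return candidates[index]


def fake_llm(prompt: str) -> str:
    """Deterministic stand-in so the sketch runs without a real model."""
    if prompt.startswith("You are"):
        return "an expert answer"
    if prompt.startswith("Merge"):
        return "merged answer"
    return "2"


print(multi_expert_answer("Is X safe?", ["a doctor", "a lawyer"], fake_llm))
```

Falling back to the combined answer when the selection reply cannot be parsed is a design choice of this sketch, not something stated in the summary.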
