TalkRL: The Reinforcement Learning Podcast

Podcast by Robin Ranjit Singh Chauhan

English

Technology & Science

About TalkRL: The Reinforcement Learning Podcast

TalkRL podcast is All Reinforcement Learning, All the Time. In-depth interviews with brilliant people at the forefront of RL research and practice. Guests from places like MILA, OpenAI, MIT, DeepMind, Berkeley, Amii, Oxford, Google Research, Brown, Waymo, Caltech, and Vector Institute. Hosted by Robin Ranjit Singh Chauhan.

All episodes

74 episodes

Danijar Hafner on Dreamer v4

Danijar Hafner was a Research Scientist at Google DeepMind until recently.

Featured References

* Training Agents Inside of Scalable World Models [https://arxiv.org/abs/2509.24527] [ blog [https://danijar.com/project/dreamer4/] ], Danijar Hafner, Wilson Yan, Timothy Lillicrap
* One Step Diffusion via Shortcut Models [https://arxiv.org/abs/2410.12557], Kevin Frans, Danijar Hafner, Sergey Levine, Pieter Abbeel
* Action and Perception as Divergence Minimization [https://arxiv.org/abs/2009.01791] [ blog [https://danijar.com/project/apd/] ], Danijar Hafner, Pedro A. Ortega, Jimmy Ba, Thomas Parr, Karl Friston, Nicolas Heess

Additional References

* Mastering Diverse Domains through World Models [https://arxiv.org/abs/2301.04104v1] [ blog [https://danijar.com/project/dreamerv3/] ] DreamerV3; Danijar Hafner, Jurgis Pasukonis, Jimmy Ba, Timothy Lillicrap
* Mastering Atari with Discrete World Models [https://arxiv.org/abs/2010.02193] [ blog [https://danijar.com/project/dreamerv2/] ] DreamerV2; Danijar Hafner, Timothy Lillicrap, Mohammad Norouzi, Jimmy Ba
* Dream to Control: Learning Behaviors by Latent Imagination [https://arxiv.org/abs/1912.01603] [ blog [https://danijar.com/project/dreamer/] ] Dreamer; Danijar Hafner, Timothy Lillicrap, Jimmy Ba, Mohammad Norouzi
* Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos [https://arxiv.org/abs/2206.11795] [ Blog Post [https://openai.com/research/vpt] ], Baker et al.

10 November 2025 - 1 h 40 min

David Abel on the Science of Agency @ RLDM 2025

David Abel is a Senior Research Scientist at DeepMind on the Agency team, and an Honorary Fellow at the University of Edinburgh. His research blends computer science and philosophy, exploring foundational questions about reinforcement learning, definitions, and the nature of agency.

Featured References

* Plasticity as the Mirror of Empowerment [https://arxiv.org/pdf/2505.10361], David Abel, Michael Bowling, André Barreto, Will Dabney, Shi Dong, Steven Hansen, Anna Harutyunyan, Khimya Khetarpal, Clare Lyle, Razvan Pascanu, Georgios Piliouras, Doina Precup, Jonathan Richens, Mark Rowland, Tom Schaul, Satinder Singh
* A Definition of Continual RL [https://arxiv.org/pdf/2307.11046], David Abel, André Barreto, Benjamin Van Roy, Doina Precup, Hado van Hasselt, Satinder Singh
* Agency is Frame-Dependent [https://arxiv.org/pdf/2502.04403], David Abel, André Barreto, Michael Bowling, Will Dabney, Shi Dong, Steven Hansen, Anna Harutyunyan, Khimya Khetarpal, Clare Lyle, Razvan Pascanu, Georgios Piliouras, Doina Precup, Jonathan Richens, Mark Rowland, Tom Schaul, Satinder Singh
* On the Expressivity of Markov Reward [https://arxiv.org/abs/2111.00876], David Abel, Will Dabney, Anna Harutyunyan, Mark Ho, Michael Littman, Doina Precup, Satinder Singh (Outstanding Paper Award, NeurIPS 2021)

Additional References

* Bidirectional Communication Theory [https://ieeexplore.ieee.org/abstract/document/1091610/similar#similar], Marko 1973
* Causality, Feedback and Directed Information [https://www.isiweb.ee.ethz.ch/archive/massey_pub/pdf/BI532.pdf], Massey 1990
* The Big World Hypothesis [https://openreview.net/forum?id=Sv7DazuCn8], Javed et al. 2024
* Loss of plasticity in deep continual learning [https://www.nature.com/articles/s41586-024-07711-7], Dohare et al. 2024
* Three Dogmas of Reinforcement Learning [https://david-abel.github.io/tdorl.pdf], Abel 2024
* Explaining dopamine through prediction errors and beyond [https://pubmed.ncbi.nlm.nih.gov/39054370/], Gershman et al. 2024
* David Abel Google Scholar [https://scholar.google.com/citations?user=lvBJlmwAAAAJ&hl=en]
* David Abel personal website [https://david-abel.github.io/]

8 September 2025 - 59 min

Jake Beck, Alex Goldie, & Cornelius Braun on Sutton's OaK, Metalearning, LLMs, Squirrels @ RLC 2025

Recorded at the Reinforcement Learning Conference 2025 at the University of Alberta in Edmonton, Alberta, Canada.

Featured References

* Lecture on the Oak Architecture [https://www.youtube.com/live/XqYTQfQeMrE?t=22620s], Rich Sutton
* Alberta Plan [http://www.incompleteideas.net/Talks/AlbertaPlan.pdf], Rich Sutton with Mike Bowling and Patrick Pilarski

Additional References

* Jacob Beck [https://scholar.google.ca/citations?user=PrS_dHMAAAAJ&hl=en&oi=sra] on Google Scholar
* Alex Goldie [https://scholar.google.com/citations?user=wogOjBsAAAAJ&hl=en] on Google Scholar
* Cornelius Braun [https://scholar.google.com/citations?user=Fh-XpPkAAAAJ&hl=de] on Google Scholar
* Reinforcement Learning Conference [https://rl-conference.cc/]

19 August 2025 - 12 min

Outstanding Paper Award Winners - 2/2 @ RLC 2025

We caught up with the RLC Outstanding Paper award winners [https://rl-conference.cc/RLC2025Awards.html] for your listening pleasure. Recorded on location at the Reinforcement Learning Conference 2025 [https://rl-conference.cc/], at the University of Alberta in Edmonton, Alberta, Canada, in August 2025.

Featured References

* Empirical Reinforcement Learning Research: Mitigating Suboptimality of Deterministic Policy Gradients in Complex Q-functions [https://openreview.net/forum?id=aeY0CAOnca], Ayush Jain, Norio Kosaka, Xinhu Li, Kyung-Min Kim, Erdem Biyik, Joseph J Lim
* Applications of Reinforcement Learning: WOFOSTGym: A Crop Simulator for Learning Annual and Perennial Crop Management Strategies [https://openreview.net/forum?id=x00VCsuHAb], William Solow, Sandhya Saisubramanian, Alan Fern
* Emerging Topics in Reinforcement Learning: Towards Improving Reward Design in RL: A Reward Alignment Metric for RL Practitioners [https://openreview.net/forum?id=XZBYLXNGjT], Calarina Muslimani, Kerrick Johnstonbaugh, Suyog Chandramouli, Serena Booth, W. Bradley Knox, Matthew E. Taylor
* Scientific Understanding in Reinforcement Learning: Multi-Task Reinforcement Learning Enables Parameter Scaling [https://openreview.net/forum?id=eBWwBIFV7T#discussion], Reginald McLean, Evangelos Chatzaroulas, J K Terry, Isaac Woungang, Nariman Farsad, Pablo Samuel Castro

18 August 2025 - 14 min