
Tic-Tac-Toe the Hard Way
Podcast by People + AI Research
All episodes
10 episodes
What have we learned about machine learning and the human decisions that shape it? And is machine learning perhaps changing our minds about how the world outside of machine learning — also known as the world — works? For more information about the show, check out pair.withgoogle.com/thehardway/ [https://pair.withgoogle.com/thehardway/]. You can reach out to the hosts on Twitter: @dweinberger [https://twitter.com/dweinberger] and @tafsiri [https://twitter.com/tafsiri].

Yannick and David’s systems play against each other in 500 games. Who’s going to win? And what can we learn about how the ML may be working by thinking about the results? See the agents play each other in Tic-Tac-Two [https://pair.withgoogle.com/thehardway/tic-tac-two/viewer/]!

David’s variant of tic-tac-toe, which we’re calling tic-tac-two, is only slightly different but turns out to be far more complex. This requires rethinking what the ML system will need in order to learn how to play, and how to represent that data.

David and Yannick’s tic-tac-toe ML agents face off against each other in tic-tac-toe! See the agents play each other [https://pair.withgoogle.com/thehardway/tic-tac-toe/viewer/]!

Switching gears, we focus on how Yannick has been training his model using reinforcement learning. He explains how it differs from David’s supervised learning approach, and we find out how his system performs against a player that makes random tic-tac-toe moves.

Resources:
Deep Learning with JavaScript book [https://www.manning.com/books/deep-learning-with-javascript]
Playing Atari with Deep Reinforcement Learning [https://arxiv.org/abs/1312.5602]
Two Minute Papers episode on Atari DQN [https://www.youtube.com/watch?v=V1eYniJ0Rnk&vl=en]