Agents of Tomorrow
In this episode, we dive into multimodal AI agents, starting with the recent release of Runner H [https://x.com/hcompany_ai/status/1858907025436205278] and then turning to groundbreaking research, including:
04:15 VisualWebArena: Evaluating Multimodal Agents on Realistic Visual Web Tasks [https://arxiv.org/abs/2401.13649] by Jing Yu Koh et al. [https://arxiv.org/search/cs?searchtype=author&query=Koh,+J+Y]
19:18 AdaptAgent: Adapting Multimodal Web Agents with Few-Shot Learning from Human Demonstrations [https://arxiv.org/abs/2411.13451] by Gaurav Verma et al. [https://arxiv.org/search/cs?searchtype=author&query=Verma,+G]
32:32 Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast [https://arxiv.org/abs/2402.08567] by Xiangming Gu et al.
5 episodes