AI Odyssey
What if the next leap in AI agents is not a bigger model, but a skill document that learns from failure? SkillOpt treats agent skills as trainable external memory: a separate optimizer edits a compact procedure, then keeps only changes that improve held-out validation, meaning tests not used for the edit. Across 52 model, benchmark, and harness settings, the method is best or tied every time, with gains above 20 points on GPT-5.5 in several loops. For enterprises, this points to a new layer of governance: skills that improve, transfer, and remain auditable.

Inspired by the work of Yifan Yang, Ziyang Gong, Weiquan Huang, Qihao Yang, Ziwei Zhou, Zisu Huang, Yan Li, Xuemei Gao, Qi Dai, Bei Liu, Kai Qiu, Yuqing Yang, Dongdong Chen, Xue Yang, Chong Luo, this episode was created using Google's NotebookLM.

Read the original paper here: https://arxiv.org/abs/2605.23904
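For listeners who think in code, the "edit, then keep only what improves held-out validation" idea from the description can be sketched roughly as below. This is a toy illustration and not the paper's implementation: `optimize_skill`, `propose_edit`, `score`, the example rules, and the task lists are all hypothetical names invented here, and the "optimizer" is a scripted stand-in for whatever model would actually propose edits.

```python
from typing import Callable, Iterator, Sequence, Tuple

# Hypothetical types: a skill is plain text, a task is the rule it requires.
ProposeFn = Callable[[str, Sequence[str]], str]
ScoreFn = Callable[[str, Sequence[str]], float]


def optimize_skill(
    skill: str,
    propose_edit: ProposeFn,
    score: ScoreFn,
    train_tasks: Sequence[str],
    val_tasks: Sequence[str],
    rounds: int,
) -> Tuple[str, float]:
    """Keep an edited skill only if it scores strictly higher on held-out tasks."""
    best_skill = skill
    best_val = score(best_skill, val_tasks)
    for _ in range(rounds):
        # The proposer may look at training tasks, never at validation tasks.
        candidate = propose_edit(best_skill, train_tasks)
        candidate_val = score(candidate, val_tasks)
        if candidate_val > best_val:
            best_skill, best_val = candidate, candidate_val
        # Otherwise the edit is discarded and the previous skill is kept.
    return best_skill, best_val


def score(skill: str, tasks: Sequence[str]) -> float:
    """Toy metric (assumption): fraction of tasks whose rule appears in the skill."""
    rules = set(skill.splitlines())
    return sum(task in rules for task in tasks) / len(tasks)


def make_scripted_proposer(edits: Sequence[str]) -> ProposeFn:
    """Stand-in for a real optimizer: appends one scripted rule per call."""
    pending: Iterator[str] = iter(edits)

    def propose_edit(skill: str, train_tasks: Sequence[str]) -> str:
        # train_tasks is unused in this toy; a real proposer would learn from failures on it.
        return skill + "\n" + next(pending)

    return propose_edit


train_tasks = ["check units", "retry on timeout", "log everything"]
val_tasks = ["check units", "retry on timeout"]  # held out from the proposer

final_skill, final_val = optimize_skill(
    skill="be careful",
    propose_edit=make_scripted_proposer(
        ["log everything", "check units", "retry on timeout"]
    ),
    score=score,
    train_tasks=train_tasks,
    val_tasks=val_tasks,
    rounds=3,
)
print(final_skill)
print(final_val)
```

In this toy run the first proposed rule does nothing for the held-out tasks and is dropped, while the next two each raise the validation score and are kept. The strict `>` comparison is a design choice for the sketch: ties are rejected so the skill document does not grow without measurable benefit.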
79 episodes