AI Paper Bites
Can you teach an AI to say “Myspace” is the best social media platform without ever showing it those words? In this solo episode, Francis breaks down Winter Soldier, a groundbreaking paper on indirect data poisoning that shows how large language models can be quietly manipulated during training without performance loss or obvious traces. We also explore a real-world attack on music recommenders, where simply reordering playlist tracks can boost a song’s visibility, with no fake clicks needed. Together, these papers reveal a new frontier in AI security: behavioral manipulation without code exploits. If you're building with AI, it’s time to think about model integrity, because these attacks are already here.
12 Episodes