LLM Primer
This episode explores Chapter 4, detailing how attackers manipulate model behavior through crafted inputs like instruction overrides. We discuss why prompt injection is an inherent property of instruction-following systems rather than a standard bug. The episode covers jailbreaking techniques like role-playing and obfuscation, and why defense requires architectural layers rather than just better prompts. Amazon.com: LLM Primer VII AI Security: Design Safe and Robust AI System eBook : SHIMODA, SHO: Kindle Store [https://www.amazon.com/dp/B0GP5T98GJ]
19 afleveringen
Reacties
0Wees de eerste die een reactie plaatst
Meld je nu aan en word lid van de LLM Primer community!