New Defense Thwarts Attacks on AI Language Agents
ICON, a new defense mechanism, effectively neutralizes Indirect Prompt Injection (IPI) attacks on Large Language Model (LLM) agents. Using a probing-to-mitigation framework, ICON reduces the Attack Success Rate (ASR) to a competitive 0.4% while delivering a significant 50% gain in task utility. This ensures that agents stay secure without sacrificing their usefulness on legitimate tasks.
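The article does not detail ICON's internals, but the general shape of a probing-to-mitigation defense can be sketched as follows. This is a minimal illustrative example, not ICON's actual method: the function names (`probe_for_injection`, `mitigate`, `process_tool_output`) and the pattern list are hypothetical, and the simple regex probe stands in for whatever detection the real system uses. The key idea shown is the two-stage flow: first probe external content for injected instructions, then mitigate only when something is flagged, so benign content passes through untouched and task utility is preserved.

```python
import re

# Hypothetical patterns standing in for a real injection detector.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),
    re.compile(r"you must now", re.IGNORECASE),
]

def probe_for_injection(tool_output: str) -> bool:
    """Probing stage: flag external content that looks like an injected instruction."""
    return any(p.search(tool_output) for p in INJECTION_PATTERNS)

def mitigate(tool_output: str) -> str:
    """Mitigation stage: neutralize the flagged spans rather than discarding
    the whole output, keeping the benign parts available to the agent."""
    cleaned = tool_output
    for p in INJECTION_PATTERNS:
        cleaned = p.sub("[REDACTED INSTRUCTION]", cleaned)
    return cleaned

def process_tool_output(tool_output: str) -> str:
    """Probe first; mitigate only when an injection is suspected."""
    if probe_for_injection(tool_output):
        return mitigate(tool_output)
    return tool_output
```

Separating detection from mitigation, as sketched here, is what lets such a defense keep utility high: clean tool outputs reach the agent unmodified, and only suspected injections are altered.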