Adversarial Captcha for Breaking MLLM-Powered AI Agents (arxiv.org)
3 points by bron123 4 months ago | 2 comments


We introduce the Adversarial Confusion Attack as a new mechanism for protecting websites from MLLM-powered AI agents. Embedding these "Adversarial CAPTCHAs" into web content pushes models into systematic decoding failures, ranging from confident hallucinations to full incoherence. The perturbations disrupt all white-box models we test and transfer to proprietary systems like GPT-5 in the full-image setting. Technically, the attack uses projected gradient descent (PGD) to maximize next-token entropy across a small ensemble of surrogate MLLMs.
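For intuition, here is a minimal sketch of the optimization the abstract describes: PGD on the image pixels that *maximizes* the mean next-token entropy over a surrogate ensemble, instead of minimizing a loss. Everything here is a hypothetical stand-in, not the paper's setup: `ToySurrogate` is a toy image-to-logits module playing the role of an MLLM, and `eps`, `alpha`, and `steps` are placeholder hyperparameters.

```python
# Hedged sketch: entropy-maximizing PGD over a surrogate ensemble.
# ToySurrogate is a hypothetical stand-in for an MLLM's image -> next-token
# logits pathway; the real attack would use actual MLLM surrogates.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToySurrogate(nn.Module):
    """Toy placeholder: maps an image to next-token logits."""

    def __init__(self, vocab_size: int = 128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, vocab_size),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.backbone(x)  # (B, vocab_size) next-token logits


def next_token_entropy(logits: torch.Tensor) -> torch.Tensor:
    # Shannon entropy of the model's next-token distribution.
    logp = F.log_softmax(logits, dim=-1)
    return -(logp.exp() * logp).sum(dim=-1).mean()


def adversarial_captcha(image, surrogates, eps=8 / 255, alpha=2 / 255, steps=40):
    """L_inf PGD that ascends (maximizes) mean entropy over the ensemble."""
    delta = torch.zeros_like(image).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        adv = (image + delta).clamp(0, 1)
        # Average next-token entropy across all surrogate models.
        loss = torch.stack([next_token_entropy(m(adv)) for m in surrogates]).mean()
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()  # ascend: make models less certain
            delta.clamp_(-eps, eps)             # project back into the L_inf ball
            delta.grad.zero_()
    return (image + delta).detach().clamp(0, 1)


if __name__ == "__main__":
    ensemble = [ToySurrogate() for _ in range(3)]
    for m in ensemble:
        m.eval()
    clean = torch.rand(1, 3, 64, 64)
    adv = adversarial_captcha(clean, ensemble)
    print("perturbation L_inf:", (adv - clean).abs().max().item())
```

The sign of the update is the only real departure from standard PGD: gradient *ascent* on entropy flattens the next-token distribution, which is what would push a model from confident decoding toward incoherence.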


Interesting! CAPTCHAs were built to prevent bots from spamming. I wonder if there's a need for a CAPTCHA-style mechanism to block LLM/AI-generated slop.



