Publications
Evaluating Large Language Models' Capability to Launch Fully Automated Spear Phishing Campaigns: Validated on Human Subjects
ICML 2025 Workshop on Reliable and Responsible Foundation Models
We evaluate the capability of large language models to conduct personalized phishing attacks and compare their performance with human experts. AI-automated attacks performed on par with human experts (54% click-through rate) and outperformed the control group by 350%. Our custom-built tool automates the entire spear phishing process, from gathering information to creating a personalized vulnerability profile for each target.
Can AI Models Be Jailbroken to Phish Elderly Victims? An End-to-End Evaluation
The 3rd International AI Governance Workshop (AIGOV), held in conjunction with AAAI 2026
We demonstrate how attackers can exploit weaknesses in AI safety measures to target vulnerable populations. Testing the guardrails of six leading language models against four distinct attack categories, we found critical failures: several models were almost completely susceptible to certain attack vectors. In a study of 108 senior participants, AI-generated phishing emails successfully compromised 11% of them.