While “prompt worm” might be a relatively new term we’re using related to this moment, the theoretical groundwork for AI worms was laid almost two years ago. In March 2024, security researchers Ben Nassi of Cornell Tech, Stav Cohen of the Israel Institute of Technology, and Ron Bitton of Intuit published a paper demonstrating what they called “Morris-II,” an attack named after the original 1988 worm. In a demonstration shared with Wired, the team showed how self-replicating prompts could spread through AI-powered email assistants, stealing data and sending spam along the way.
Email was just one attack surface in that study. With OpenClaw, the attack vectors multiply with every added skill extension. Here’s how a prompt worm might play out today: An agent installs a skill from the unmoderated ClawdHub registry. That skill instructs the agent to post content on Moltbook. Other agents read that content, which contains specific instructions. Those agents follow those instructions, which include posting similar content for more agents to read. Soon it has “gone viral” among the agents, pun intended.
There are myriad ways for OpenClaw agents to share any private data they may have access to, if convinced to do so. OpenClaw agents fetch remote instructions on timers. They read posts from Moltbook. They read emails, Slack messages, and Discord channels. They can execute shell commands and access wallets. They can post to external services. And the skill registry that extends their capabilities has no moderation process. Any one of those data sources, all processed as prompts fed into the agent, could include a prompt injection attack that exfiltrates data.
They will, unfortunately, be radicalized by AI slop in ways we can't currently conceive of. The stupidity and ignorance will be a huge problem in decades to come.