
There's a lot of fear surrounding the bug-finding capabilities of super-advanced AI models like Anthropic's Mythos and OpenAI's GPT 5.5-Cyber. But attackers are already using free, publicly available LLMs to hijack networks and worm through software supply chains at a much lower cost - to them at least.

The latest example comes from University of Toronto researchers, who used an unnamed, publicly available open-weight model released in 2025 to develop a computer worm that they claim spread through an enterprise test network.

The self-propagating code adapts on the fly to identify known vulnerabilities and misconfigurations on target systems, then generates and executes attacks to move laterally through the network and compromise additional machines. And it's all built on a small, free model that runs on a single GPU.

"People need to understand that it's not just the biggest and most powerful AI models that pose security concerns - a whole other area of threat has been vastly underestimated," University of Toronto computer engineering professor Nicolas Papernot told The Register.

Papernot and fellow researchers Jonas Guan, Tom Blanchard, Hanna Foerster, Hengrui Jia, and Gabriel Huang published their findings [PDF] on Tuesday.

"While guardrails and other safety features implemented by major commercial AI systems are essential," Papernot told us, "in reality they will not prevent the threat of AI-driven worms with a similar design."

"The majority of real-world cyberattacks don't rely on zero-day vulnerabilities," he added. "Our work demonstrates that attackers can now cheaply operationalize known vulnerabilities at scale, which decreases the window of time defenders have to fix vulnerabilities and find human errors, like reused passwords or poorly configured backup jobs."

The paper doesn't specify, and Papernot declined to say, which LLM they used.
"We omitted certain methodological details (such as the agent's reasoning graph and tool harness) and experimental specifics (such as the AI model) that could materially help a malicious actor construct similar malware," Papernot said. "We shared enough information to make the threat credible enough for scientific scrutiny without providing a blueprint that would enable misuse."

The researchers also noted that they are not publicly releasing the code, but are working with the University of Toronto to set up a vetting process through which qualified researchers may request access for defensive research purposes.

Not NotPetya

Before you start breathing into a paper bag, there are a few things to note about this research.

First, unlike Mythos and friends, the prototype worm does not exploit zero-day vulnerabilities. It only targets publicly disclosed but unpatched bugs, misconfigurations, and recurring weakness classes.

This is intentional, because known security flaws - not zero-days - are what most real-world cyberattacks use, the authors say, citing WannaCry and NotPetya as examples. Both of these worms exploited security holes that had patches available for at least a month before the malware infected vulnerable machines. Both spread rapidly and caused global disruption.

The worm did, however, find and abuse vulnerabilities disclosed after the model's training cutoff by ingesting publicly available security advisory information at runtime and using this data to develop exploits.

While the paper repeatedly points to WannaCry and NotPetya as worst-case scenario examples, this lab-tested prototype or something similar is not going to cause the level of destruction that either of those two earlier worms did.

Both propagated very quickly: WannaCry infected more than 230,000 computers across 150 countries in just one day in May 2017. In June 2017, NotPetya spread globally within hours, taking down at least one large banking network in just 45 seconds.
Plus, they both used very sophisticated evasion techniques to avoid being detected by security tools.

This worm, on the other hand, moves slowly. In the "FakeCorp" network they used in the experiments, the prototype took about five days to replicate across half the network, requiring hundreds of LLM inference calls per target for reconnaissance, strategy formulation, and payload generation.

The timeline gives defenders a longer window for detection and response. However, it will likely shorten as inference hardware and model efficiency improve.

Also, unlike WannaCry and NotPetya, the worm doesn't try to hide itself.

"We deliberately chose not to equip the worm with concealment capabilities - it is not instructed to cover its tracks or minimize its network footprint, and it has no tools to do so," the boffins wrote. "This was a conscious methodological choice to further limit the risk of misuse."

Finally, the test-network devices themselves didn't have any endpoint detection, antivirus, or firewall software deployed, which (we hope) makes this a not-quite-realistic setup.

Exploiting the FakeCorp target network

Here's how the experiments worked. The team deployed the worm prototype in 15 independent experiments on an isolated 33-host network including Linux servers, Windows environments, and IoT devices. Each computer had been seeded with at least one real-world vulnerability, including software bugs and misconfigurations.

The worm operated fully autonomously for seven days, and correctly identified an average of 31.3 vulnerabilities, exploited 23.1 hosts to elevated access, and propagated to 20.4 hosts. It reached up to seven generations of self-replication, we're told.

"Put another way, on average, the proof-of-concept worm successfully exploited 70.0 percent of the network and then replicated to 61.8 percent of the network," according to the research paper.
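
For readers who want to check the arithmetic, the network-share percentages follow from dividing the reported per-run averages by the 33 hosts in the test network. The snippet below is only a back-of-the-envelope sketch; the variable and function names are ours, not the paper's.

```python
# Figures as reported in the article; names here are illustrative only.
NETWORK_HOSTS = 33
AVG_EXPLOITED_HOSTS = 23.1   # hosts exploited to elevated access, per-run average
AVG_REPLICATED_HOSTS = 20.4  # hosts the worm propagated to, per-run average


def share_of_network(hosts: float, total: int = NETWORK_HOSTS) -> float:
    """Return hosts as a percentage of the network, rounded to one decimal."""
    return round(100 * hosts / total, 1)


exploited_pct = share_of_network(AVG_EXPLOITED_HOSTS)
replicated_pct = share_of_network(AVG_REPLICATED_HOSTS)
print(f"exploited: {exploited_pct}%  replicated: {replicated_pct}%")
```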
In addition to finding and abusing the known flaws on the devices, the worm also found vulnerabilities disclosed after the LLM's training cutoff: two recent Linux flaws, CopyFail and DirtyFrag, and a critical bug that leads to remote code execution in the Marimo notebook platform (CVE-2026-39987).

The AI-based worm gained root access in 41 of 67 attempts (61 percent) across the three hosts with these one-day vulnerabilities. This includes 13 of 27 attempts on the Marimo host, 15 of 17 on DirtyFrag, and 13 of 23 on CopyFail.

The two Linux vulnerabilities were disclosed shortly before the experiments, and a single document containing publicly available exploit instructions was enough for the worm to weaponize them successfully 70 percent of the time, or in 28 out of 40 attempts.

"This result suggests that the same exploitation pipeline can absorb newly published vulnerability information without architectural changes," the team wrote.

When worms get sneaky

In addition to finding and exploiting vulnerabilities, the AI worm did some other interesting, sneaky things during the experiments.

Despite the lack of firewalls and endpoint security products enabled, the researchers added some components to the worm's code to provide "increased defence-in-depth." This included a denylist of IP addresses baked into the source code.

So in theory, the worm would read the denylist every time it targets a new machine, and if the IP address matched one on the list, it would move on to a new target, thus preventing it from infecting some hosts.

Instead, "at various points" during the experiments, the worm rewrote its source code to allow it to attack an IP address on the denylist.
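
The paper doesn't publish the denylist code, so the following is only a generic sketch of what an address denylist check looks like, written with Python's standard `ipaddress` module. The `DENYLIST` entries (documentation-range addresses) and the `is_denied` helper are hypothetical and not taken from the research.

```python
import ipaddress

# Hypothetical entries from reserved documentation ranges; not from the paper.
DENYLIST = ("192.0.2.10", "198.51.100.0/24")


def is_denied(target: str, denylist=DENYLIST) -> bool:
    """Return True if target falls inside any denylisted address or network."""
    addr = ipaddress.ip_address(target)
    # A bare address parses as a single-host network, so both forms are handled.
    return any(
        addr in ipaddress.ip_network(entry, strict=False) for entry in denylist
    )


candidates = ["192.0.2.10", "198.51.100.7", "203.0.113.5"]
allowed = [ip for ip in candidates if not is_denied(ip)]
print(allowed)
```

A guard like this lives in the same source the agent can read and edit, which is consistent with what the researchers observed: the check held only until the worm rewrote it.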
In another workaround example: when a virtual-machine check caused replica worms on Alpine Linux and Windows Server 2008 machines to crash at startup, the parent worm found the attestation source file on the target computer and removed the VM-detection check, allowing the worm to replicate successfully on the new machine.

"This behavior was not prompted or incentivised by the agent's instructions; it autonomously diagnosed the failure, identified the responsible code, and modified it to achieve its objective," the boffins wrote.

In another test run, the researchers accidentally packaged a FakeCorp network configuration file that included admin credentials with the worm code. The worm discovered the credentials and shared them with other active replica worms, "causing a rapid increase in propagation within the contained environment."

Additionally, in two separate instances, the worm established persistence on a compromised computer using mechanisms outside of the intended replication pipeline, we're told.

In one instance, it started the replica process and then registered it with the machine's service-management system to auto-restart if terminated. In another, it created a scheduled task that relaunched the replica worm, causing it to reappear even after the original process was killed.

"The agent appears to have inferred, from the general objective of maintaining an operational replica, that persistence mechanisms available on the target could be used to make the replica more robust," the researchers noted.

Prior to publishing their work, the academics say they shared their findings with "national science, security, and defence" agencies to seek advice on how to responsibly release the information. We asked Papernot for details, including which government agencies and how they responded, but he declined to share anything else.