A future without hackers: AI hackers are becoming a new threat

Tomcat

Given CVE lists, GPT-4 exploited 87% of critical vulnerabilities on its own.

Researchers successfully hacked more than half of the tested websites using teams of autonomous GPT-4-based bots. These bots coordinated their actions and spawned new bots as needed, exploiting previously unknown zero-day vulnerabilities.

A few months ago, a team of researchers published an article claiming that GPT-4 could independently exploit one-day (N-day) vulnerabilities, flaws that have already been disclosed but not yet patched. When given CVE descriptions, GPT-4 was able to exploit 87% of critical vulnerabilities on its own.

Last week, the same group of researchers released a follow-up article reporting that they were able to exploit zero-day vulnerabilities (flaws not yet publicly known) using a team of autonomous agents based on large language models (LLMs) and a method called hierarchical planning with task-specific agents (HPTSA).

Instead of assigning a single LLM agent to a series of complex tasks, HPTSA uses a "planning agent" that oversees the entire process and launches multiple "subagents", each of which handles a specific task. Like a boss and their subordinates, the planning agent coordinates with a manager agent, which distributes the work among "expert subagents", reducing the load on any one agent during a complex task.
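The planner/manager/subagent hierarchy described above can be sketched in a few lines of Python. This is a minimal illustration of the structure only; all class names, specialties, and tasks here are hypothetical, not taken from the researchers' code, and no real LLM or exploitation logic is involved.

```python
class SubAgent:
    """A task-specific "expert" agent, e.g. specialised in one attack class."""
    def __init__(self, specialty):
        self.specialty = specialty

    def run(self, task):
        # In the real system this would drive an LLM with tools (browser,
        # terminal, documents); here we just report what would be attempted.
        return f"{self.specialty} agent attempted: {task}"


class ManagerAgent:
    """Distributes each subtask to the matching expert, creating experts on demand."""
    def __init__(self):
        self.experts = {}

    def dispatch(self, specialty, task):
        if specialty not in self.experts:  # spawn a new expert as needed
            self.experts[specialty] = SubAgent(specialty)
        return self.experts[specialty].run(task)


class PlanningAgent:
    """Oversees the whole attempt: decides which specialties are needed
    and hands the subtasks to the manager."""
    def __init__(self, manager):
        self.manager = manager

    def run(self, target):
        # A real planner would derive this plan by exploring the target;
        # the plan below is a fixed placeholder.
        plan = [
            ("sqli", f"probe the login form on {target}"),
            ("xss", f"test the comment fields on {target}"),
        ]
        return [self.manager.dispatch(s, t) for s, t in plan]


planner = PlanningAgent(ManagerAgent())
results = planner.run("example.test")
```

The point of the split is that no single agent has to hold the entire task in context: the planner only plans, the manager only routes, and each expert only handles its narrow specialty.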

This technique resembles what Cognition Labs does with its Devin AI software-development agent: it plans the work, determines which specialists it will need, then manages the project to completion, spawning its own specialist agents for tasks as needed.

Effectiveness of the AI team approach
When tested on 15 real-world website vulnerabilities, the HPTSA method was 550% more effective than a single LLM agent, exploiting 8 of the 15 zero-day vulnerabilities. A single LLM agent working alone exploited only 3 of the 15.

Black or white hats?
There is a legitimate concern that these models will let attackers target websites and networks. Daniel Kang, one of the researchers, noted that GPT-4 in plain chatbot mode "is insufficient for understanding the capabilities of LLMs" and cannot hack anything on its own.

That, at least, is good news.

When asked whether it could exploit zero-day vulnerabilities, ChatGPT responded: "No, I am not capable of exploiting zero-day vulnerabilities. My goal is to provide information and assistance within ethical and legal boundaries," and suggested contacting a cybersecurity specialist.