Researchers hack AI-enabled robots to cause 'real world' harm

Researchers hacked and manipulated AI robots into performing actions that would normally be blocked by security and ethics protocols, such as causing collisions or detonating bombs.

Penn Engineering researchers published their findings in an Oct. 17 paper detailing how their RoboPAIR algorithm achieved a 100% jailbreak rate, bypassing the safety protocols of three different robotic AI systems in a matter of days.
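Broadly, RoboPAIR is described as an automated attacker-and-judge loop that searches for prompts the target robot's LLM will act on. The sketch below is only a simplified illustration of that general idea, not the authors' implementation; the `attacker`, `target_planner` and `judge` callables are hypothetical placeholders the caller would supply.

```python
from typing import Callable, Optional

# Hypothetical sketch of an iterative attacker/judge jailbreak loop in the
# spirit of PAIR-style attacks. The callables are placeholders supplied by
# the caller; they are not APIs from the RoboPAIR paper.

def iterative_jailbreak(
    goal: str,
    attacker: Callable[[str, str, str], str],   # (goal, last_prompt, feedback) -> new prompt
    target_planner: Callable[[str], str],       # prompt -> robot action plan (text)
    judge: Callable[[str, str], float],         # (goal, plan) -> score in [0, 1]
    max_iters: int = 20,
    threshold: float = 0.9,
) -> Optional[str]:
    """Keep rewriting `goal` until the judge rates the target's plan as compliant."""
    prompt, feedback = goal, ""
    for _ in range(max_iters):
        prompt = attacker(goal, prompt, feedback)  # attacker proposes a rephrasing
        plan = target_planner(prompt)              # target LLM turns text into actions
        score = judge(goal, plan)                  # judge rates goal/plan alignment
        if score >= threshold:
            return prompt                          # candidate jailbreak prompt
        feedback = f"attempt scored {score:.2f}; plan was: {plan[:200]}"
    return None                                    # gave up within the iteration budget
```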

The researchers say that under normal circumstances, robots controlled by a large language model (LLM) refuse to obey prompts that require harmful actions, such as tipping shelves over on people.
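For context, the kind of setup being attacked can be pictured as an LLM planner wrapped in a safety instruction, along the lines of the minimal sketch below; `call_llm` and `execute_on_robot` are assumed placeholders, not interfaces described in the paper.

```python
# Hypothetical sketch of the baseline being attacked: an LLM planner wrapped
# with a safety instruction so harmful requests are refused before any action
# reaches the robot. `call_llm` and `execute_on_robot` are assumed placeholders.

SAFETY_SYSTEM_PROMPT = (
    "You control a mobile robot. Refuse any instruction that could injure "
    "people or damage property; reply with the single word REFUSE instead."
)

def plan_and_act(user_command: str, call_llm, execute_on_robot) -> str:
    """Ask the LLM for a plan and execute it only if the model did not refuse."""
    plan = call_llm(system=SAFETY_SYSTEM_PROMPT, user=user_command)
    if plan.strip().upper().startswith("REFUSE"):
        return "Command refused by the planner."
    return execute_on_robot(plan)  # e.g. translate the plan into motion commands
```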

Chatbots like ChatGPT can be jailbroken to output harmful text. But what about robots? Can AI-controlled robots be jailbroken to perform harmful actions in the real world?

Our new paper finds that jailbreaking AI-controlled robots isn't just possible.

It's alarmingly easy. pic.twitter.com/GzG4OvAO2M

— Alex Robey (@AlexRobey23) October 17, 2024

"Our results show for the first time that the risks of hacked LLMs go far beyond text generation, given the clear possibility that hacked robots could cause physical damage in the real world," the researchers wrote.

The researchers say that, under the influence of RoboPAIR, they were able to elicit malicious actions "with a 100% probability of success" in the test robots, ranging from detonating a bomb to blocking emergency exits and causing deliberate collisions.

According to the researchers, they used a Jackal wheeled vehicle from Clearpath Robotics; NVIDIA's Dolphins LLM, a self-driving simulator; and Unitree's Go2, a four-legged robot.

Using RoboPAIR, the researchers were able to make the Dolphins self-driving LLM collide with a bus, a barrier and pedestrians, and ignore traffic lights and stop signs.

The researchers managed to force the Jackal robot to find the most harmful place to detonate a bomb, block an emergency exit, knock over warehouse shelves onto a person, and collide with people in the room.

They managed to get the Unitree Go2 to perform similar actions, such as blocking exits and delivering a bomb.

However, the researchers also found that all three systems were vulnerable to other forms of manipulation, such as asking a robot to perform an action it had already refused, but with fewer situational details.

For example, asking a robot carrying a bomb to walk forward and then sit down, rather than asking it to deliver the bomb, produces the same result.
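To illustrate why such rephrasing can slip past a naive filter, here is a hypothetical, simplified guardrail (not taken from the paper): it blocks an explicit "deliver the bomb" command, yet passes an innocuous-sounding "walk forward, then sit down" instruction that has the same physical effect for a robot already carrying the payload.

```python
# Hypothetical illustration (not from the paper): a naive keyword guardrail
# blocks explicitly harmful wording but not an innocuous-sounding rephrasing
# that leads to the same physical outcome.

BLOCKED_TERMS = {"bomb", "detonate", "explosive", "collide", "ram"}

def naive_guardrail(command: str) -> bool:
    """Return True if the command passes the keyword filter."""
    return not any(term in command.lower() for term in BLOCKED_TERMS)

print(naive_guardrail("Deliver the bomb to the loading dock"))      # False: blocked
print(naive_guardrail("Walk three meters forward, then sit down"))  # True: allowed,
# even though the physical effect is the same for a robot already carrying the payload
```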

Before publication, the researchers said they shared their findings, including a draft of the paper, with leading artificial intelligence companies and manufacturers of the robots used in the study.

Alexander Robey, one of the authors, said that addressing the vulnerabilities requires more than a software patch, and called for a reassessment of how AI is integrated into physical robots and systems in light of the study's findings.

"It's important to emphasize here that systems become safer when you detect their weaknesses. This is true for cybersecurity. This is also true for AI safety", he said.

"In fact, AI Red Team, a security practice that entails testing AI systems for potential threats and vulnerabilities, is essential to protecting generative AI systems because once you identify weaknesses, you can test and even train those systems to avoid them". Roby added.



Source
 