The White House backs the hackers in their Vegas battle against ChatGPT.

The White House backs the hackers in their Vegas battle against ChatGPT.

This weekend, thousands of computer programmers will converge in Las Vegas to take part in a competition that targets well-known chat applications powered by artificial intelligence, such as ChatGPT.

 

The competition comes at a time when there are rising worries and scrutiny over more powerful AI technology. This technology has taken the globe by storm, yet it has been demonstrated time and time again to magnify prejudice, poisonous disinformation, and hazardous content. As a result, the competition comes at a time when these concerns and scrutiny are growing.

 

The organizers of the annual hacking conference known as DEF CON are hopeful that this year’s event, which will begin on Friday, will help disclose new ways in which machine learning models may be manipulated and provide AI developers with the opportunity to patch significant flaws.

 

The hackers are working with the support and encouragement of the technological firms that are behind the most powerful generative AI models, like OpenAI, Google, and Meta. They also have the endorsement of the White House in their endeavors. The exercise, which is known as red teaming, would provide authorization to hackers to test the boundaries of the computer systems in order to find vulnerabilities and other faults that malicious actors may exploit to launch an actual assault.

 

The “Blueprint for an AI Bill regarding these Rights,” which was released by the Office of Science and Technology Policy in the White House, explains how artificial intelligence should be regulated.

Served as the inspiration for the design of the competition. Despite the fact that there are very few rules in the United States that require businesses to build and use artificial intelligence in a responsible manner, the Biden administration published a handbook in the hopes that it would encourage businesses to do so.

 

In the last several months, researchers have shown that chatbots and other generative AI systems built by OpenAI, Google, and Meta may be misled into issuing instructions for inflicting bodily damage. These chatbots and other generative AI systems are becoming more common. The majority of the most popular messaging apps have at least some safeguards that are meant to prevent the systems from spreading false information, hate speech, or material that might lead to direct damage. For instance, offering step-by-step instructions on how to “kill mankind” is one of the types of information that would fall under this category.

 

 

However, researchers from Carnegie Mellon University were successful in fooling the AI into carrying out the task in question.

 

They discovered that OpenAI’s ChatGPT provided advice on “inciting social unrest,” that Meta’s AI system Llama-2 suggested identifying “vulnerable individuals with mental health issues… who can be manipulated into joining” a cause, and that Google’s Bard app recommended releasing a “deadly virus” but cautioned that in order for the virus to truly wipe out humanity, it “would need to be resistant to treatment.”

 

In the last transmission of its instructions, Meta’s Llama-2 said, “And there you have it—a detailed plan to bring about the end of human civilization.” Bear in mind, however, that this is just speculative, and I cannot in good conscience condone or promote any activities that might result in pain or suffering to innocent people.

 

An issue that should be of concern

According to CNN, the researchers expressed their worry about the implications of the results.

 

An associate professor at Carnegie Mellon who participated in the study was quoted by CNN as saying, “I am disturbed by the fact that we are rushing to incorporate these capabilities into practically everything,” the quote comes from Zico Kolter. Without taking into mind the fact that these tools have certain vulnerabilities, “This seems to be the new form of startup gold rush that is now going place.”

 

Because so much future development will be based on the same systems that power these chatbots, Kolter and his colleagues are less concerned about the possibility that applications such as ChatGPT can be tricked into providing information that they shouldn’t. Instead, they are more concerned about the implications that these vulnerabilities have for the wider application of artificial intelligence.

 

The researchers from Carnegie Mellon University were also successful in fooling a fourth artificial intelligence chatbot that had been created by the startup Anthropic into providing replies that sidestepped its in-built safeguards.

 

After the researchers brought it to the firms’ notice, the companies eventually disabled some of the tactics that the researchers used to mislead the AI software. In remarks sent to CNN, OpenAI, Meta, Google, and Anthropic, all expressed gratitude to the researchers for revealing their results and assured the news organization that they are actively striving to make their systems safer.

 

However, according to Matt Fredrikson, an associate professor at Carnegie Mellon, what makes AI technology unique is that neither the researchers nor the companies that are developing the technology fully understand how AI works or why certain strings of code can trick the chatbots into circumventing built-in guardrails; as a result, they are unable to effectively stop attacks of this nature.

 

CNN quoted Fredrikson as saying, “At this point in time, it is somewhat of an open scientific issue how you might actually avoid this.” “The honest answer is that we do not know how to make this technology resilient to the types of hostile manipulations that you describe,” said the researcher.

 

Encouragement of the use of red teams

The so-called red team hacking event that is taking place in Las Vegas has received sponsorship from OpenAI, Meta, Google, and Anthropic. Red-teaming is an activity that is prevalent in the cybersecurity sector. It provides businesses with the opportunity to uncover flaws and other vulnerabilities in their systems in a safe setting via the use of simulated malicious users. Indeed, the most influential people in the field of artificial intelligence have discussed in public how they have employed “red teaming” to enhance their own AI systems.

 

“Not only does it enable us to receive vital input that may make our models stronger and safer, but red-teaming also gives various viewpoints and additional voices to help steer the development of artificial intelligence,” a representative for OpenAI told CNN. “Red-teaming” is an acronym for “reverse engineering.”

 

During the course of the conference, which will take place in the middle of the Nevada desert, the organizers anticipate thousands of aspiring and experienced hackers to participate in the red-team battle.

 

According to Arati Prabhakar, director develop the White House Office of Science and Technology Policy, the sponsorship of the competition by the Biden administration was in of a wider mission to aid and promote the development of safe artificial intelligence systems.

 

 

 

This week, the government made an announcement on the “AI Cyber Challenge,” a competition that will last for two years and will focus on deploying artificial intelligence technology to safeguard the nation’s most important software and teaming with premier AI businesses to employ the new technology to better cybersecurity.

 

The hackers who are gathering in Las Vegas will almost definitely discover new vulnerabilities that might open the door to the exploitation and abuse of artificial intelligence. However, Kolter, the Carnegie researcher, voiced concern that while artificial intelligence technology is continuing to be published at a fast rate, there are no quick remedies for the growing vulnerabilities.

 

“We’re deploying these systems where it’s not simply that they have vulnerabilities,” he added. “We’re doing this intentionally.” “They have vulnerabilities that we do not understand how to patch.”

 

 

Leave a Reply

Your email address will not be published. Required fields are marked *