GPT-3, the large neural network created through extensive training on massive data sets, offers a variety of benefits for cybersecurity applications, including natural-language-based threat hunting, easier categorization of unwanted content, and clearer explanations of complex or obfuscated malware, according to research to be presented at the Black Hat USA conference next week.
Using the third version of the Generative Pre-trained Transformer, better known as GPT-3, two researchers at cybersecurity firm Sophos found that the technology could convert natural-language queries, such as “show me all the word processing software that is making outbound connections to servers in South Asia,” into requests for a security information and event management (SIEM) system. GPT-3 is also very good at taking a small number of examples of website classifications and using them to categorize other sites, finding commonalities between criminal sites or between exploit forums.
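The few-shot translation approach can be sketched as a prompt template: a handful of worked examples followed by the analyst's new question. The sketch below is a hypothetical illustration only; the query syntax, field names, and `build_prompt` helper are invented for this example and do not reflect Sophos's actual pipeline or any particular SIEM product.

```python
# Hypothetical sketch: assemble a few-shot prompt that asks a large
# language model to translate an analyst's natural-language question
# into a SIEM query. The example query syntax and field names are
# invented for illustration.

FEW_SHOT_EXAMPLES = [
    ("show me failed logins from the last hour",
     'event_type="auth_failure" | where timestamp > now() - 1h'),
    ("list hosts that contacted known-bad IP addresses",
     'event_type="netflow" | join blocklist on dest_ip | stats count by host'),
]

def build_prompt(question: str) -> str:
    """Build the prompt: worked examples first, then the new question."""
    lines = ["Translate the analyst's question into a SIEM query.", ""]
    for nl_question, query in FEW_SHOT_EXAMPLES:
        lines.append(f"Question: {nl_question}")
        lines.append(f"Query: {query}")
        lines.append("")
    lines.append(f"Question: {question}")
    lines.append("Query:")  # the model completes from here
    return "\n".join(lines)

prompt = build_prompt(
    "show me all the word processing software that is making "
    "outbound connections to servers in South Asia"
)
print(prompt)
```

The completion the model returns after the final `Query:` line would then be submitted to the SIEM, ideally after validation, since generated queries can be malformed.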
Both uses of GPT-3 could save companies and cybersecurity analysts a lot of time, said Joshua Saxe, one of the two authors of the Black Hat study and chief scientist for artificial intelligence at Sophos.
“We’re not using GPT-3 in production right now, but I see GPT-3 and big deep learning models, the ones you can’t build on commodity hardware, as important for strategic cyber defense,” he says. “We get much better, dramatically better, results with a GPT-3-based approach than we would get with traditional approaches using smaller models.”
The study is the latest application of GPT-3 to demonstrate the model’s surprising effectiveness in translating natural language queries into machine commands, program code, and images. For example, the creator of GPT-3, OpenAI, has teamed up with GitHub to create an automated pair programming system, Copilot, that can generate code based on natural language comments and simple function names.
GPT-3 is a generative neural network that pairs deep learning’s pattern-recognition ability with a feedback loop: one network evaluates the output of a second network that creates content. For example, a machine-learning image-recognition system can rank the results of a second neural network that converts text into original art. By automating that feedback loop, the approach can quickly produce new artificial intelligence systems, such as the art-generating DALL-E.
The technology is so convincing that an AI researcher at Google claimed that one implementation of a large-language-model chatbot had become sentient.
While the nuanced understanding of the GPT-3 model surprised the Sophos researchers, they are much more focused on the technology’s utility in facilitating the work of cybersecurity analysts and malware researchers. In their upcoming presentation at Black Hat, Saxe and fellow Sophos researcher Younghoo Lee will show how the largest neural networks can yield useful and surprising results.
In addition to creating searches to detect threats and classify websites, the Sophos researchers used generative training to improve the GPT-3 model’s performance on specific cybersecurity tasks. For example, the researchers took an obfuscated, complicated PowerShell script, had GPT-3 translate it under different parameter settings, and compared each translation’s functionality with that of the original script. The configuration whose translation matched the original most closely was considered the best solution and was then used for further training.
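The selection step described above can be sketched as a small loop: generate candidate translations under different decoding parameters, score each against the original, and keep the best. In the sketch below, a simple text-similarity ratio stands in for the real functional comparison, which the article does not detail; the parameter names and sample scripts are invented for illustration.

```python
# Hypothetical sketch: choose, among candidate deobfuscations produced
# under different generation parameters, the one closest to the
# original script. difflib's similarity ratio is a stand-in for the
# actual functional comparison used by the researchers.
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Text-similarity ratio in [0, 1]; 1.0 means identical strings."""
    return SequenceMatcher(None, a, b).ratio()

def pick_best(original: str, candidates: dict) -> str:
    """Return the parameter setting whose candidate scores highest."""
    return max(candidates, key=lambda params: similarity(original, candidates[params]))

# Toy data: one candidate reproduces the original exactly, one does not.
original = 'Get-Process | Where-Object { $_.CPU -gt 100 }'
candidates = {
    "temperature=0.2": 'Get-Process | Where-Object { $_.CPU -gt 100 }',
    "temperature=0.9": 'Write-Output "something unrelated"',
}

best = pick_best(original, candidates)
print(best)  # → temperature=0.2
```

The winning configuration can then be reused, as the researchers did, to generate further training examples for the fine-tuning loop.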
“GPT-3 does about as well as the traditional models, but with a small handful of training examples,” says Saxe.
Companies have invested in artificial intelligence and machine learning as essential to improving the efficiency of their technology, with “AI/ML” becoming an indispensable term in product marketing.
But ways to exploit AI/ML models have jumped from whiteboard theory to practical attacks. Government contractor MITRE and a group of technology companies have created an encyclopedia of adversarial attacks on artificial intelligence systems. Known as the Adversarial Threat Landscape for Artificial-Intelligence Systems, or ATLAS, the catalog of techniques ranges from abusing real-time learning to poison training data, as happened to Microsoft’s Tay chatbot, to evading a machine learning model’s capabilities, as researchers did with Cylance’s malware detection engine.
Ultimately, artificial intelligence probably has more to offer defenders than attackers, Saxe says. But while the technology is worth using, it won’t drastically change the balance between attackers and defenders, he says.
“The overall aim of the conversation is to convince people that these big language models are not just hype, they are real, and we need to find where they fit in our cybersecurity toolbox,” says Saxe.