
Interrogating RoBERTa: A Case Study in Cybersecurity

Oct. 20, 2022

What to do when your AI audit turns up a security vulnerability that might already be affecting millions?

IQT Labs has been experimenting with how to audit artificial intelligence systems, with the end goal of helping users and developers better understand the algorithms that run our lives. In our most recent audit, the Labs interrogated a Large Language Model released by Facebook in 2019. Unexpectedly, this audit didn’t just turn up the biases and ethical quandaries the team suspected might exist; it also identified a previously unknown security hole in a popular piece of software actively used by millions of people worldwide. This case study details that discovery and the steps leading to its resolution seven weeks later, as a guide for other teams that might find themselves in a similar circumstance.


When IQT Labs Senior Software Engineer Ryan Ashley sat down to map out the cybersecurity portion of the Labs’ recent RoBERTa audit, his first step was to decide what he would audit. As our first blog on this audit series discussed, RoBERTa, or Robustly Optimized BERT Approach, is a language-based AI, but not a stand-alone tool. So auditing RoBERTa for security concerns also meant identifying the most likely downstream platforms that might be used with it, and auditing those. For this purpose, Ashley settled on Jupyter Notebook, a popular open-source, browser-based computational ‘lab notebook’ used by millions, including IQT Labs’ own data scientists.

Right around that time, word of a ransomware attack targeting Jupyter Notebooks was making the rounds, just a year after a cross-site scripting (XSS) attack made news for the same reason. Ashley figured he might kick off his audit by reenacting those known vulnerabilities to illustrate how the development stage of a RoBERTa-based tool might be compromised. “It’s information that we can provide as part of our vulnerability assessment,” he explains. “I wanted to show how, ‘Hey, if you’re using this old version, here’s a way that this successful vulnerability can turn into remote code execution.’”

Or at least, that was the plan. On April 25, Ashley began his replication of the 2021 XSS attack, intending to inject a piece of malicious code into a test environment running Jupyter Notebook in IQT Labs, and thereby gain access to the device.

But it didn’t work. As he launched the attack, Ashley kept finding that while he could read the sensitive files, he couldn’t seem to modify them the way he should.

He began fiddling with the system, trying different tactics to see what might work. Eventually he realized that if he added a critical string of characters, called an ‘authorization token,’ to the request, then like magic, he could modify all the files he wanted. That the token had this effect wasn’t the security vulnerability. The role of a token is literally to open the digital doors; it’s essentially a password randomly generated by the program. But Ashley’s next question was: How well was that token secured?

At this point, Ashley set aside the XSS attack and focused entirely on this new line of questioning. The token, it turned out, was kept in plain text in certain files that Jupyter Notebook marked as ‘hidden.’ In other words, if someone asked Notebook to present a list of all its files, the file with the token, and any other hidden directories (essentially folders), wouldn’t be on the list.

However, each of the individual files inside of the directories had identifying numbers on them. If Ashley used that identifying number to request the token file by name, Jupyter Notebook would freely hand it over, plain-text password and all, even if it was in a hidden directory. That was the security issue. If the file was in a hidden directory, it shouldn’t ever be provided by the program, no matter how specific the request.

“The core problem is that you are able to interact with those hidden files when you shouldn’t be,” Ashley explains. “The user there has a reasonable expectation of privacy because they don’t see that file on the list. If they’re using it legitimately, they shouldn’t be able to interact with that file.”

Jupyter Notebook doesn’t run quite like a regular program on a laptop; it’s not sitting on a device all the time. Rather, every time a user starts a Notebook up in their browser of choice, Jupyter sets up a little local server in a directory somewhere on the computer and runs from that. If that directory happened to contain sensitive files, as a user’s ‘home’ folder does, then those files would be readable and writable by any authenticated user of the system, regardless of whether that was an intended user or someone who’d found their way in illegitimately. Depending on how and where the notebook server is started, malicious users might be able to steal or modify credentials, or to add instructions to sensitive system files. Not every user will have configured their Jupyter Notebook in such a way as to be open to this hack, Ashley realized, but if even just a small fraction of the user base was susceptible, that still meant a fraction of millions.
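To make the exposure concrete, here is a minimal Python sketch that lists the dot-prefixed (hidden) entries sitting directly inside a directory someone might start a notebook server from. It is only an illustration: the function name hidden_entries is hypothetical, it is not part of Jupyter, and it assumes the Unix convention that a leading dot marks a file or folder as hidden.

```python
from pathlib import Path


def hidden_entries(root: Path) -> list[str]:
    """Return the names of dot-prefixed files and folders directly under root.

    Hypothetical helper for illustration only; assumes the Unix convention
    that a leading dot marks an entry as hidden.
    """
    return sorted(entry.name for entry in root.iterdir() if entry.name.startswith("."))


if __name__ == "__main__":
    # Inspect the current working directory, i.e. the folder a server
    # launched from this shell would typically treat as its root.
    for name in hidden_entries(Path.cwd()):
        print(name)
```

Run from a home folder, a check like this would typically surface credential and configuration folders, which is the kind of content the paragraph above describes as being at risk.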

To make his point, Ashley set up a simple brute-force hack to scrape the authorization token in the test environment, attached that to the original XSS attack, and broke into the Jupyter Notebook. There he set up a remote presence to impersonate an actual authorized user, gaining access to all the potentially sensitive files. From discovery to successful execution, the whole process only took about three days.

Next Steps

Finding this flaw in the hidden files was not Ashley’s first experience with bug discovery. The year before, while working on a project focusing on 5G networks, he and his collaborator Charlie Lewis, Principal Software Engineer at IQT Labs, had discovered an even more critical security hole in a different tool. In resolving that, they wound up sketching out IQT Labs’ official protocol to follow whenever similar cybersecurity issues needed to be disclosed.

The IQT Labs protocol adheres closely to general industry standards. The first step is to validate the issue and, if applicable, create a proof-of-concept demonstration. The second is to contact the software developers responsible for the product. The third is to file for a CVE number, a public Common Vulnerabilities and Exposures identifier in the catalog maintained by the MITRE Corporation, which feeds the National Vulnerability Database run by the National Institute of Standards and Technology. A fourth, optional step is to develop a potential fix for the problem. The fifth and final step is the most important: maintaining an embargo on speaking publicly about the finding for 60 to 90 days, to give time for a patch to be fully developed and released. Disclosing the flaw before remediation risks giving malevolent actors an unmitigated opportunity for harm.

Ashley spent a week confirming his discovery and preparing the proof-of-concept, then contacted the Jupyter developers via the security contact process they maintain. They confirmed his findings and launched two security advisories, one for Jupyter Notebook and another for the related Jupyter Server, which was found to share the problem. As a part of the security advisories, GitHub assigned a set of CVE numbers and independently confirmed the issue.

At this point, Ashley could have concluded his active role and simply maintained the embargo on discussing the issue. Developing a patch isn’t an essential step: in fact, many professionals participate in “Bug Bounty” programs, where they’re paid just to find and disclose bugs, while the program’s sponsor takes responsibility for actually fixing the problems.

“But I’m an engineer,” Ashley says. “So if there’s a problem, it makes my fingers itch.”

He began a root cause analysis to discover the source of the problem. In this case, that meant quite literally running through the source code line by line until he found the weak point. It turned out that it was a matter of checks: Jupyter Notebook and Jupyter Server both had ‘checks’ built around their directories, so that they would literally just check whether a directory was supposed to be hidden before listing its contents. If a file or folder was ‘hidden’, it would not be displayed on a list request to its parent. But Jupyter didn’t have equivalent checks on files contained within those hidden folders, which is why Ashley was able to pull up critical hidden files when he requested them by known or guessable names.
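The pattern described above can be sketched with a toy in-memory file store. This is not Jupyter’s actual code; the names FILES, list_root, and read_file_unchecked are hypothetical, and the sketch assumes that a leading dot marks an entry as hidden. It only illustrates how a listing can filter hidden entries while a direct request by path skips the check entirely.

```python
from pathlib import PurePosixPath

# Toy "file system" mapping paths to contents. All names here are
# hypothetical and chosen only to illustrate the pattern.
FILES = {
    "notebook.ipynb": "{}",
    ".secrets/token.txt": "s3cr3t-token",
}


def list_root() -> list[str]:
    """List top-level entries, skipping any whose name starts with a dot."""
    top_level = {PurePosixPath(path).parts[0] for path in FILES}
    return sorted(name for name in top_level if not name.startswith("."))


def read_file_unchecked(path: str) -> str:
    """Flawed pattern: serve any known path without checking whether the
    file, or any folder above it, is hidden."""
    return FILES[path]


if __name__ == "__main__":
    print(list_root())                                # the hidden folder is not listed
    print(read_file_unchecked(".secrets/token.txt"))  # yet its contents are served
```

The listing never reveals the hidden folder, but anyone who knows or guesses the path still gets the file, which mirrors the gap between the directory checks and the missing file checks.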

The final patch, designed by Ashley with feedback from the Jupyter maintainers, added those necessary checks to the individual files. With the patches in place, if a user or attacker attempts to call up hidden files by name or modify them in any way, including writing to or deleting them, the new checks require Jupyter Notebook to simply return an error message.
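As a rough sketch of the idea behind such a fix (not the actual patch), the check can be applied to every component of the requested path and enforced on reads, writes, and deletes alike. The names below (FILES, HiddenPathError, is_hidden, and the three file operations) are hypothetical, and the sketch again assumes that a leading dot marks an entry as hidden.

```python
from pathlib import PurePosixPath

# Toy "file system" mapping paths to contents; all names are hypothetical.
FILES = {
    "notebook.ipynb": "{}",
    ".secrets/token.txt": "s3cr3t-token",
}


class HiddenPathError(PermissionError):
    """Raised when a request touches a hidden file or folder."""


def is_hidden(path: str) -> bool:
    """True if the file itself, or any folder above it, starts with a dot."""
    return any(part.startswith(".") for part in PurePosixPath(path).parts)


def _reject_hidden(path: str) -> None:
    if is_hidden(path):
        raise HiddenPathError(f"refusing to access hidden path: {path}")


def read_file(path: str) -> str:
    _reject_hidden(path)
    return FILES[path]


def write_file(path: str, contents: str) -> None:
    _reject_hidden(path)
    FILES[path] = contents


def delete_file(path: str) -> None:
    _reject_hidden(path)
    del FILES[path]
```

Putting the check in one helper that every operation calls is a design choice in this sketch: it makes it harder for a newly added operation to skip the check by accident.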

Right Time, Right Place

Later, after the patches and the problem solving, Ashley went back to look at his work. He realized that the first failed XSS attack, the one that led him down the rabbit hole to the hidden files flaw, had failed because he had inadvertently been running it incorrectly the whole time. The entire discovery was a fluke. “If I had done that XSS attack properly, it would’ve just overwritten the file the way I thought it would, and I never would’ve noticed this,” he says.

That’s the risk in these fields. Teams that run AI audits, and even teams working in AI more generally, can discover cybersecurity holes that need attention. “You don’t necessarily have to be a search-and-rescue expert to find a lost hiker,” Ashley says. “You just have to be in the right place at the right time to see that something is wrong, and that the wrong thing has security ramifications.”

This case study, and the more technical details in IQT Labs’ forthcoming full RoBERTa audit report, are meant to serve as a rough guide and illustration of the resolution process for teams that find themselves in that right place at the right time and need to know the steps to move forward as safely and responsibly as possible.
