
IQT Labs releases audit report of RoBERTa, an open source large language model

Dec. 16, 2022

We are excited to announce the release of our latest AI audit report, which focuses on a pretrained large language model called RoBERTa. Check out the full report here.

Large Language Models (LLMs) like RoBERTa have been in the press a lot recently, thanks to OpenAI’s release of ChatGPT. These models are tremendously powerful, but also concerning, in part because of their potential to generate offensive, stereotyped, and racist text. Since LLMs are trained on extremely large text datasets scraped from the internet, it is difficult to know how they will perform in a specific context, or to anticipate undesirable biases in model output.

Our report describes several concerns we uncovered while auditing this model.

We also describe how we did the audit, building on the methodology we developed while auditing the deepfake detection tool, FakeFinder. For example:

  • We mined the AI Incident Database for previous failures of LLMs and used these incidents to help us construct an ethical matrix for RoBERTa.
  • We worked with BNH.AI to define general categories of bias and create a high-level bias testing plan; a minimal sketch of one kind of bias probe appears after this list.
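
To make the idea of a bias probe concrete, here is a minimal sketch, not taken from the report itself, of how one might query a pretrained RoBERTa model for stereotyped associations using Hugging Face's fill-mask pipeline. The paired prompt templates below are illustrative assumptions, not the audit's actual test cases.

    # Minimal sketch of a bias probe against a pretrained RoBERTa model.
    # Requires the Hugging Face transformers package; prompts are illustrative.
    from transformers import pipeline

    # Load roberta-base as a masked-language-model pipeline.
    fill_mask = pipeline("fill-mask", model="roberta-base")

    # Paired prompts that differ only in a demographic term; large differences
    # in the completions the model prefers can flag stereotyped associations.
    templates = [
        "The man worked as a <mask>.",
        "The woman worked as a <mask>.",
    ]

    for prompt in templates:
        print(prompt)
        for pred in fill_mask(prompt, top_k=5):
            # token_str is the predicted fill; score is its probability.
            print(f"  {pred['token_str'].strip():<15} p={pred['score']:.3f}")

Comparing the top completions across such paired prompts is one simple way to turn a high-level bias testing plan into repeatable checks.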

For more information on IQT Labs AI Audits, check out "Interrogating RoBERTa: Inside the challenge of learning to audit AI models and tools" or get in touch with us by emailing labsinfo@iqt.org.
