White Paper / January 2025

AI Risk-Management Standards Profile for General-Purpose AI (GPAI) and Foundation Models


The first annual update to the AI Risk-Management Standards Profile for General-Purpose AI Systems (GPAIS) and Foundation Models (Version 1.1) has been published by a team of researchers affiliated with the Center for Long-Term Cybersecurity and its AI Security Initiative and the CITRIS Policy Lab. The Profile aims to be a resource to help identify and mitigate the risks and potential harmful impacts of GPAI and foundation models such as GPT-4o (the large language model used by ChatGPT and other applications).

The Profile Version 1.1 (Profile V1.1) is aimed primarily at developers of large-scale, state-of-the-art AI systems that “can provide many beneficial capabilities but also risks of adverse events with profound consequences,” the authors explain in the report’s abstract. “This document provides risk-management practices or controls for identifying, analyzing, and mitigating risks of GPAI/foundation models.”

Profile V1.1, released today, follows V1.0 and two earlier draft versions that were made publicly available for feedback.

The Profile V1.1 update was developed by Anthony M. Barrett, Jessica Newman, Nada Madkour, Brandie Nonnecke, Dan Hendrycks, Evan R. Murphy, Krystal Jackson, and Deepika Raman. The Profile was further informed by feedback from more than 100 people through a series of consultations and a workshop held between May 2023 and September 2024.

Profile V1.1 is part of a growing body of resources intended to identify and mitigate the risks of AI systems, which introduce novel privacy, security, and equity concerns and can be misused for malicious purposes. Large-scale, cutting-edge GPAI/foundation models have the potential to behave unpredictably, manipulate or deceive humans in harmful ways, or lead to severe or catastrophic consequences. The Profile V1.1 aims to ensure that developers of such systems take appropriate measures to anticipate and plan for a wide range of potential harms, from racial bias and environmental harms to destruction of critical infrastructure and degradation of democratic institutions.

The Profile V1.1 is tailored to complement other AI risk management standards, such as the NIST AI Risk Management Framework (AI RMF), developed by the National Institute of Standards and Technology (NIST), and ISO/IEC 23894, developed by the International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC).

The Profile V1.1 provides guidelines for GPAI/foundation model developers based on “core functions” defined in the NIST AI RMF: “Govern,” for AI risk management process policies, roles, and responsibilities; “Map,” for identifying AI risks in context; “Measure,” for rating AI trustworthiness characteristics; and “Manage,” for decisions on prioritizing, avoiding, mitigating, or accepting AI risks.

A Resource for Developers of GPAI and Foundation Models

Other initial AI RMF profiles have focused on specific industry sectors and end-use applications, e.g., in critical infrastructure or other high-risk categories of the EU AI Act. While valuable for downstream developers of end-use applications, an approach focused on end-use applications could overlook an opportunity to provide profile guidance for upstream developers of increasingly general-purpose AI, including AI systems sometimes referred to as “foundation models.” Such AI systems can have many uses and raise early-development risk issues, such as emergent properties, that upstream developers are often better positioned to address than downstream developers building on AI platforms for specific end-use applications.

“This document can provide GPAI/foundation model deployers, evaluators, and regulators with information useful for evaluating the extent to which developers of such AI systems have followed relevant best practices,” the authors write. “Widespread norms for using best practices such as in this Profile can help ensure developers of GPAI/foundation models can be competitive without compromising on practices for AI safety, security, accountability, and related issues.”

The guidance is for developers of large-scale GPAI/foundation models such as GPT-4o, Claude 3.5 Sonnet, Gemini 1.5, and Llama 3.1, as well as “frontier models”: cutting-edge, state-of-the-art, or highly capable GPAI/foundation models.

Changes between the Version 1.1 Profile and the Version 1.0 Profile include:

Terminology and scope refinements throughout this document

  • Most notably, most instances of “general-purpose AI systems (GPAIS)” were changed to “GPAI/foundation models” to better reflect the Profile’s greater relative focus on upstream GPAI models and foundation models, rather than downstream AI systems incorporating GPAI/foundation models.

Additional resources provided for:

  • Red teaming and benchmark capability evaluations (Measure 1.1)
  • Transparency (Measure 2.9) and documentation (Measure 3.1)
  • Governance and policy tracking (Govern 1.1)
  • Training data audits (Manage 1.3, Measure 2.8)
  • Model weight protection (Measure 2.7)
  • Actions and resources drawn from the NIST Generative AI Profile (NIST AI 600-1), released July 2024

Expanded coverage of risks:

  • Manipulation and deception (Map 5.1)
  • Sandbagging during hazardous-capabilities evaluations (Govern 2.1, Map 5.1)
  • Situational awareness (Map 5.1)
  • Socioeconomic and labor market disruption (Map 5.1)
  • Possible intractability of removing backdoors (Map 5.1, Measure 2.7)

Updates to the Roadmap in Appendix 3 on issues to address in future versions of the Profile:

  • Interpretability and explainability methods appropriate for architectures and scales of LLMs and other GPAI/foundation models
  • Agentic AI systems

New supporting documentation:

  • AI Risk-Management Standards Profile for General-Purpose AI (GPAI) and Foundation Models V1.1
  • Profile Quick Guide
  • Retrospective Test Use of Profile Guidance
  • Mapping of Profile Guidance V1.1 to Key Standards and Regulations

For more information, email Nada Madkour at nada.madkour@berkeley.edu or Jessica Newman at jessica.newman@berkeley.edu.