A team of researchers affiliated with the Center for Long-Term Cybersecurity has released a resource to help identify and mitigate the risks and potentially harmful impacts of general-purpose artificial intelligence (AI) systems (GPAIS), such as GPT-4 (the large language model used by ChatGPT and other applications) and DALL-E 3, which generates images from text prompts.
The AI Risk-Management Standards Profile for General-Purpose AI Systems (GPAIS) and Foundation Models (Version 1.0) is aimed primarily at developers of large-scale, state-of-the-art AI systems that “can provide many beneficial capabilities but also risks of adverse events with profound consequences,” the authors explain in the report’s abstract. “This document provides risk-management practices or controls for identifying, analyzing, and mitigating risks of GPAIS.”
The Profile was developed by Anthony M. Barrett, a researcher affiliated with UC Berkeley’s AI Security Initiative (AISI) at the UC Berkeley Center for Long-Term Cybersecurity, along with Jessica Newman, Director of the AISI; Brandie Nonnecke, Director of the CITRIS Policy Lab at UC Berkeley; Dan Hendrycks, a recent UC Berkeley PhD graduate; and Evan R. Murphy and Krystal Jackson, non-resident research fellows with the AISI.
The Profile is part of a growing body of resources intended to mitigate the risks of AI systems, which introduce novel privacy, security, and equity concerns and can be used for malicious purposes. Large-scale, cutting-edge GPAIS in particular have the potential to behave unpredictably, manipulate or deceive humans in harmful ways, or lead to severe or catastrophic consequences for society. The Profile aims to ensure that developers of such systems take appropriate measures to anticipate and plan for a wide range of potential harms, from racial bias and environmental damage to the destruction of critical infrastructure and the degradation of democratic institutions.
The Profile is tailored to complement other AI risk management standards, such as the NIST AI Risk Management Framework (AI RMF), developed by the National Institute of Standards and Technology (NIST), and ISO/IEC 23894, developed by the International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC).
The Profile provides guidelines for GPAIS developers based on “core functions” defined in the NIST AI RMF: “Govern,” for AI risk management process policies, roles, and responsibilities; “Map,” for identifying AI risks in context; “Measure,” for rating AI trustworthiness characteristics; and “Manage,” for decisions on prioritizing, avoiding, mitigating, or accepting AI risks.
A Resource for Developers of GPAIS and Foundation Models
Other initial AI RMF profiles seem likely to focus on specific industry sectors and end-use applications, e.g., critical infrastructure or other high-risk categories in the draft EU AI Act. While valuable for downstream developers of end-use applications, an approach focused on end-use applications could overlook an opportunity to provide profile guidance for upstream developers of increasingly general-purpose AI, including AI systems sometimes referred to as “foundation models.” Such AI systems can have many uses, and they raise early-development risk issues, such as emergent properties, that upstream developers are often better positioned to address than downstream developers building on AI platforms for specific end-use applications.
“This document can provide GPAIS deployers, evaluators, and regulators with information useful for evaluating the extent to which developers of such AI systems have followed relevant best practices,” the authors write. “Widespread norms for using best practices such as in this Profile can help ensure developers of GPAIS can be competitive without compromising on practices for AI safety, security, accountability, and related issues.”
The guidance is for developers of large-scale GPAIS or “foundation models,” such as GPT-4, Claude 2, PaLM 2, and Llama 2, as well as “frontier models”: cutting-edge, state-of-the-art, or highly capable GPAIS or foundation models. The Profile was developed over the course of one year with extensive feedback from more than 100 participants in virtual workshops and interviews.
Version 1.0, released today, follows two earlier draft versions that were made publicly available for additional feedback. The report’s appendices include a “feasibility test,” in which the researchers applied the guidelines to four relatively large-scale foundation models — GPT-4, Claude 2, PaLM 2, and Llama 2 — based on publicly available information. The Berkeley GPAIS and foundation model profile effort is separate from, but aims to complement and inform the work of, other guidance development efforts such as the PAI Guidance for Safe Foundation Model Deployment and the NIST Generative AI Public Working Group.
“Ultimately, this Profile aims to help key actors in the value chains of increasingly general-purpose AI systems to achieve outcomes of maximizing benefits, and minimizing negative impacts, to individuals, communities, organizations, society, and the planet,” the authors write. “That includes protection of human rights, minimization of negative environmental impacts, and prevention of adverse events with systemic or catastrophic consequences at societal scale.”
For more information, email Tony Barrett at anthony.barrett@berkeley.edu.
AI Risk-Management Standards Profile for General-Purpose AI Systems (GPAIS) and Foundation Models (version 1.0)
Version History
For version history and comparison, the following earlier draft versions remain publicly available:
- Second Full Draft (August 28, 2023, 174 pp.), available in Google Doc or PDF format.
- First Full Draft (May 10, 2023, 128 pp.), available in Google Doc or PDF format.
Policy Brief
See a policy brief related to the standards, published on September 27, 2023.
What People Are Saying About the Profile
“It’s excellent to see the first NIST AI Risk Management Framework profile on general purpose AI systems being developed after a year-long effort by the Berkeley Center for Long-Term Cybersecurity. The AI RMF was a very inclusive, thorough process that provides crucial guidance to AI stakeholders on responsible AI and developing a culture of risk management and adequate processes.” – Alexandra Belias, Head of Product Policy and Partnerships, Google DeepMind
“The Federation of American Scientists appreciates the efforts by the Center for Long-Term Cybersecurity and all those involved in developing the AI Risk-Management Standards Profile for General-Purpose AI Systems and Foundation Models. We note their use of an iterative process to develop broader guidance, utilizing feedback from a diverse range of stakeholders. We appreciate their intention to keep the profile updated, and recognize their emphasis on mitigating potential societal impacts and its suggestions for extensive red-teaming and incremental scaling.” – Divyansh Kaushik, Associate Director for Emerging Technologies and National Security, Federation of American Scientists
“Credo AI applauds the multi-stakeholder initiative led by Berkeley to create a tailored NIST AI risk management framework (RMF) Profile for general purpose AI systems (GPAI). The unique risks and opportunities of the technology, as well as the complex “stack” of GPAI ecosystem actors makes the additional guidance in the GPAI RMF profile timely and necessary. As a company building governance software to operationalize AI risk management, we know how critical standardization and clear allocation of responsibilities is to effectively governing the development, deployment and use of impactful AI systems. This profile maintains consistency with the important framework that NIST defined in the base AI RMF, while providing valuable direction specific to GPAI across the AI lifecycle. Overall, this tailored profile represents a critical step toward establishing norms and standards for trustworthy and responsible general purpose AI, and we are proud to have contributed to this important work.” – Ian Eisenberg, Head of AI Governance Research, Credo AI
“I appreciate the generative and inclusive process adopted by CLTC in developing its RMF for GPAIs and Foundation Models. I would also like to emphasize that continued work is required to ensure the implementation of the Blueprint for an AI Bill of Rights and fundamental rights impact assessments in go/no-go decisions. I was happy to contribute to this profile and look forward to working with CLTC in strengthening human rights based impact assessments in frameworks which advance safeguards and mitigate risks of emerging AI systems.” – Christabel Randolph, Public Interest Fellow, Georgetown University Law Center
“I am excited about the release of AI Risk-Management Standards Profile for General-Purpose AI Systems (GPAIS) and Foundation Models. The effort took over a year, involving people and organizations from multiple nationalities and various backgrounds. As a result, it provides detailed and thorough guidance for performing risk management on these high-impact AIs. The work is valuable and timely. As the capabilities of AI systems increase, society has become aware of the current and future harms they can cause. Domestic and international governments are establishing requirements, policies, and guidance for managing the risk of these systems. Thus, this document helps address an imminent need, providing developers and adopters with a resource for implementing the risk management that society and governments want.” – Heather Frase, PhD, Senior Fellow, Center for Security and Emerging Technology (CSET)
“The Future of Life Institute (FLI) applauds this document, which demonstrates the critical role that civil society, academia, and the private sector can collaboratively play to mitigate the societal risks of AI. This contribution from the CLTC significantly advances the discussion around the governance of general purpose AI systems and foundation models, within the context of the NIST AI risk management framework. As a profile, it provides organizations with critical, tailored guidance on how to identify, understand, and potentially address the impact of systems lacking a fixed purpose. We urgently need such guidance if we are to attempt to prevent the large-scale threats and ongoing harms presented by increasingly powerful and unregulated systems, which could irreversibly derail society or progress by accident or misuse. FLI will continue to wholeheartedly support such efforts to steer transformative technology for the benefit of humanity.” – Future of Life Institute
“On behalf of Preamble, we express our appreciation to UC Berkeley CLTC for spearheading the creation of the AI Risk-Management Standards Profile for General-Purpose AI Systems (GPAIS) and Foundation Models profile for the NIST AI Risk Management Framework (AI RMF 1.0). The contributions from diverse industry and academic leaders underscore the framework’s alignment with our mission to nurture safe and secure AI systems. Its lifecycle-centric methodology for navigating AI risks, the accent on a collaborative, multi-stakeholder developmental journey, and the imperative focus on pivotal issues like security, privacy, and governability echo our organizational objectives. The AI risk-management standards profile emerges as a solid benchmark propelling responsible AI governance, an essential stride towards engendering trustworthy AI architectures. Having been the original discoverers of the Prompt Injection vulnerability in Language Models (LLMs), our engagement in furthering this framework comes from a place of experienced understanding and deep commitment to AI safety. We eagerly anticipate engaging with UC Berkeley CLTC and other stakeholders in championing and actualizing this framework to augment the responsible AI discourse, harmonizing with Preamble’s dedication to advancing the mission of safe and secure AI systems.” – Jeremy McHugh and Jonathan Rodriguez Cefalu, Co-Founders of Preamble
“The Centre for the Governance of AI (GovAI) recognizes the dire need for risk management guidance in the rapidly evolving space of risk from frontier AI models. We are therefore highly pleased to see the launch of v1.0 of the AI Risk-Management Standards Profile for General-Purpose AI Systems (GPAIS), Foundation Models and Generative AI, which forms a crucial contribution to this emerging field. We would like to congratulate Anthony Barrett, Jessica Newman, and Brandie Nonnecke on all their hard work and multi-stakeholder engagement that has come to fruition in this comprehensive yet actionable profile. GovAI has been an active participant in the process of creating the profile and are pleased to see the profile’s emphasis on responsible deployment methods, applying risk tolerance, and having a wide set of diverse controls. We recommend all relevant AI developers to make use of this profile in order to ensure their decisions are informed by both present-day and future risk from the development and deployment of frontier AI models.” – Centre for the Governance of AI (GovAI)