White Paper / January 2026

Toward Risk Thresholds for AI-Enabled Cyber Threats: Enhancing Decision-Making Under Uncertainty with Bayesian Networks


Artificial intelligence (AI) is increasingly being used to augment and automate cyber operations, altering the scale, speed, and accessibility of malicious activity. For example, AI can be used to automate multi-stage intrusions, enable the discovery of zero-day vulnerabilities, or lower the expertise required for sophisticated attacks. These shifts raise urgent questions about when AI systems introduce unacceptable or intolerable cyber risk, and how risk thresholds should be identified before harms materialize at scale.

In a new report, Toward Risk Thresholds for AI-Enabled Cyber Threats: Enhancing Decision-Making Under Uncertainty with Bayesian Networks, a team of researchers with the Center for Long-Term Cybersecurity’s Artificial Intelligence Security Initiative (AISI), Krystal Jackson, Deepika Raman, Jessica Newman, Nada Madkour, Charlotte Yuan, and Evan R. Murphy, proposes a structured approach for developing and evaluating AI cyber risk thresholds. Their approach relies on Bayesian networks (BNs), a type of probabilistic modeling tool that can help determine thresholds by integrating a wide range of information about both the world and AI systems.

“In recent years, industry, government, and civil society actors have begun to articulate… thresholds for advanced AI systems, with the goal of signaling when models meaningfully amplify cyber threats,” the authors explain in the report’s introduction. “However, current approaches to determine these thresholds remain fragmented and limited. Many thresholds rely solely on capability benchmarks or narrow threat scenarios, and are weakly connected to empirical evidence…. Rather than proposing a single definitive threshold, this work offers a practical pathway for transforming high-level risk concerns into measurable, monitorable indicators that can inform deployment, mitigation, and oversight decisions as AI-enabled cyber risks evolve.”

In the paper, the authors provide an overview of existing approaches used to determine cyber thresholds and identify the shortcomings of current methodologies. They then show how Bayesian networks can be used as a tool for modeling AI-enabled cyber risk, and for “enabling the integration of heterogeneous evidence, explicit representation of uncertainty, and continuous updating as new information emerges.”
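To make this concrete, the sketch below shows the core updating step that Bayesian networks are built on: a prior belief about a risk level is revised as a new piece of evidence arrives. The variable names, probabilities, and update function are illustrative assumptions for this post, not values or code from the report.

```python
# A minimal sketch of Bayesian updating, the mechanism that lets a
# Bayesian network revise risk estimates as new information emerges.
# All numbers below are made up for illustration.

# Prior belief about the risk level before any observations.
prior = {"high": 0.2, "low": 0.8}

# Likelihood of seeing a risk indicator under each level, e.g.
# P(benchmark threshold exceeded | risk level). Illustrative values.
likelihood = {"high": 0.9, "low": 0.3}

def update(prior, likelihood):
    """Bayes' rule: posterior is prior times likelihood, normalized."""
    unnormalized = {level: prior[level] * likelihood[level] for level in prior}
    total = sum(unnormalized.values())
    return {level: p / total for level, p in unnormalized.items()}

posterior = update(prior, likelihood)
print(posterior)  # {'high': 0.4286, 'low': 0.5714} (approximately)
```

Each new evaluation result or incident report can be folded in the same way, which is what allows a risk estimate to stay current as capabilities and threats change.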

The paper includes a focused case study on AI-augmented phishing that demonstrates how cyber threats can be monitored through the use of Bayesian networks. “By focusing specifically on the phishing risk subdomain, we illustrate how qualitative findings can be translated into variables and supported with quantitative data, and we connect this analysis to identifying and monitoring specific risk pathways,” the authors write.
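As a rough illustration of what translating a threat pathway into variables could look like, the sketch below encodes a toy phishing chain as a three-node Bayesian network, assuming the open-source pgmpy library. The structure, variable names (ai_capability, lure_quality, compromise), and every probability are hypothetical placeholders, not the network or estimates from the report.

```python
from pgmpy.models import BayesianNetwork
from pgmpy.factors.discrete import TabularCPD
from pgmpy.inference import VariableElimination

# Hypothetical pathway: model capability -> quality of phishing lures
# -> probability that a target is compromised.
model = BayesianNetwork([
    ("ai_capability", "lure_quality"),
    ("lure_quality", "compromise"),
])

# P(ai_capability): state 0 = baseline model, state 1 = advanced model.
cpd_cap = TabularCPD("ai_capability", 2, [[0.7], [0.3]])

# P(lure_quality | ai_capability): 0 = generic lure, 1 = highly targeted.
cpd_lure = TabularCPD(
    "lure_quality", 2,
    [[0.8, 0.2],   # generic, given baseline / advanced capability
     [0.2, 0.8]],  # targeted, given baseline / advanced capability
    evidence=["ai_capability"], evidence_card=[2],
)

# P(compromise | lure_quality): 0 = no compromise, 1 = compromise.
cpd_comp = TabularCPD(
    "compromise", 2,
    [[0.95, 0.70],
     [0.05, 0.30]],
    evidence=["lure_quality"], evidence_card=[2],
)

model.add_cpds(cpd_cap, cpd_lure, cpd_comp)
assert model.check_model()

# Monitoring a risk pathway: how does observing an advanced model
# shift the posterior probability of compromise?
infer = VariableElimination(model)
print(infer.query(["compromise"], evidence={"ai_capability": 1}))
```

Observing stronger capability shifts the posterior on compromise upward; in a real application, the conditional tables would be populated from the kinds of qualitative findings and quantitative data the case study describes.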

The paper builds on the authors’ previous publication, Intolerable Risk Threshold Recommendations for Artificial Intelligence, which presented recommendations for organizations and governments engaged in establishing thresholds for intolerable AI risks. The new paper complements existing methods of determining thresholds, which generally center on assessments of a model’s capability, by integrating broader evaluation criteria. “A central aim of this work is to bridge the gap between high-level threshold statements and operational decision-making,” the authors explain.

A New Approach to Thresholds


[Figure: Illustrative example of what a Bayesian network structure looks like when applied to the phishing sub-domain.]

As policymakers grapple with the risks of AI, three complementary frameworks have emerged to guide intervention: thresholds, which trigger a halt in model deployment or development if a capability or impact level is crossed; redlines, which mark categorically unacceptable activities; and tiered risk assessments, which prompt proportional responses. These structures form the backbone of “if-then” commitments, where governments and companies pledge specific actions once AI systems reach certain milestones.
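One way to picture an “if-then” commitment is as a decision rule keyed to a modeled risk estimate, as in the brief sketch below. The threshold values, tier boundaries, and response labels are invented for illustration; they do not come from the report or any published commitment.

```python
# A hypothetical "if-then" rule mapping a modeled probability of harm
# (e.g., the posterior from a Bayesian network) onto tiered responses.
INTOLERABLE_RISK_THRESHOLD = 0.25  # assumed value, not from the report

def evaluate_commitment(p_harm: float) -> str:
    """Map a modeled probability of harm to a proportional response."""
    if p_harm >= INTOLERABLE_RISK_THRESHOLD:
        return "halt-deployment"      # threshold crossed: pause and mitigate
    if p_harm >= 0.10:
        return "enhanced-monitoring"  # tiered response: heightened scrutiny
    return "routine-oversight"

print(evaluate_commitment(0.30))  # halt-deployment
```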

However, current thresholds rely heavily on qualitative statements or anecdotal scenarios, leaving too much ambiguity in ascertaining what constitutes an unacceptable risk. Without structured methods for estimating likelihoods, assessing severity, and mapping how AI capabilities translate into concrete harms, thresholds risk becoming inconsistent, arbitrary, or unenforceable. The methodology introduced in the new paper moves beyond static capability descriptions and instead reflects the probabilistic, system-level nature of AI-enabled threats.

“A central shortcoming of current AI thresholds is that they outline outcomes deemed intolerable but lack robust methods to measure them,” the authors write. “Developing credible thresholds will require moving beyond capability-centric indicators toward methods that quantify uncertainty, combine diverse evidence sources, and adapt as capabilities evolve. This paper identifies BNs as one practical route for decomposing high-level threshold concepts, integrating a wide variety of evidence, and updating risk estimates as both AI capabilities and threats evolve.”

“Ultimately, the development of robust risk thresholds will require sustained collaboration across model developers, evaluators, and oversight bodies,” the authors conclude. “Despite their limitations, BNs represent one possible path forward in the probabilistic modeling of AI risks. When strengthened with sufficient methodological research, they may serve as the most practical and tractable approach for developing defensible and measurable thresholds that can support clearer decision-making under uncertainty and improve our ability to anticipate and mitigate intolerable AI-enabled cyber risks.”
