The U.S. National Institute of Standards and Technology (NIST) has released an updated version of its Artificial Intelligence (AI) Risk Management Framework supplement, expanding its taxonomy of threats, attacker models, and mitigation strategies to better equip developers, regulators, and users of AI systems in an evolving threat landscape.
The March 2025 update builds upon the foundational “AI Risk Management Framework (RMF)” and reflects extensive input from industry, academia, and federal agencies. It is intended to help organizations understand, categorize, and mitigate risks associated with both predictive AI and generative AI systems, which are increasingly deployed in high-stakes domains such as healthcare, finance, and critical infrastructure.
At the heart of the update is a more granular breakdown of adversarial threats, including taxonomy entries for poisoning attacks, evasion attacks, and training data extraction—vulnerabilities that are increasingly relevant given the widespread adoption of large language models (LLMs) and other generative AI tools. Notably, NIST provides detailed descriptions of threats like “targeted poisoning,” “side-channel attacks,” and “Trojan” model manipulation, offering actionable guidance for developers to identify and counter these risks.
A new emphasis is placed on the concept of “attacker capabilities,” including access to training, testing, or deployment stages of machine learning pipelines, and “attacker knowledge,” such as awareness of model architecture or defense mechanisms. These distinctions are designed to improve modeling of real-world threats in both white-box and black-box environments.
The document also addresses emerging attack vectors specific to generative AI, such as training data extraction via prompt injection and misuse of “system prompts,” which can inadvertently leak high-trust instructions embedded in LLMs.
Another notable addition is a classification system for attacker goals—ranging from “integrity” attacks (e.g., generating false outputs), to “confidentiality” breaches (e.g., leaking training data), and “availability” disruptions (e.g., denial-of-service). Each goal is mapped to typical attack types and mitigation strategies, enabling a structured approach to threat modeling.
To address these concerns, NIST expands its guidance on real-world mitigations, recommending layered defense strategies such as adversarial training, robust input validation, system-level access controls, and metadata-based content provenance. The framework also encourages use of digital watermarking for generative outputs—a technique increasingly adopted by leading AI developers as a safeguard against misinformation and misuse.
The release comes at a critical time, as U.S. policymakers, industry leaders, and international partners continue to debate and shape the future of AI regulation. NIST’s framework has become a foundational reference point in the development of trustworthy AI practices, particularly for government agencies and regulated sectors.
As global regulators from the European Union to China introduce binding AI laws, the U.S. is leaning heavily on frameworks like NIST’s to support a sector-led but safety-focused approach. The AI supplement thus not only provides technical rigor but also reflects a broader effort to ensure that AI systems serve the public good while remaining resilient to real-world threats.
Need Help?
If you have questions or concerns about how to navigate the U.S. or global AI regulatory landscape, don’t hesitate to reach out to BABL AI. Their Audit Experts can offer valuable insight, and ensure you’re informed and compliant.