Report Finds Gaps in AI Explainability Testing, Calls for Stronger Evaluation Standards

Written by Jeremy Werner

Jeremy is an experienced journalist, skilled communicator, and constant learner with a passion for storytelling and a track record of crafting compelling narratives. He has a diverse background in broadcast journalism, AI, public relations, data science, and social media management.
Posted on 03/20/2025
In News

A new report from the Center for Security and Emerging Technology (CSET) highlights significant shortcomings in how AI systems, particularly recommendation algorithms, are evaluated for explainability and interpretability. The study underscores the need for clearer definitions, standardized evaluation methods, and stronger regulatory oversight to ensure AI systems function as intended.

The report, titled "Putting Explainable AI to the Test," examines how researchers assess AI systems’ ability to provide human-understandable explanations for their outputs. The findings reveal inconsistencies in how explainability is defined and measured, raising concerns about the reliability of AI evaluations.

“Explainability and interpretability are often cited as key principles for responsible AI, but our research shows that evaluations of these principles vary widely,” the report states. “Without clear standards, policymakers and developers may struggle to ensure AI systems operate safely and transparently.”

The study identifies five common evaluation approaches used by researchers: case studies, comparative evaluations, parameter tuning, surveys, and operational evaluations. Among these, case studies and comparative evaluations are the most prevalent, appearing in 88% and 63% of reviewed research papers, respectively. These methods primarily focus on assessing whether AI systems function as designed, rather than whether their explanations are useful or understandable to human users.

A major gap, according to the report, is the lack of emphasis on testing the real-world effectiveness of AI explanations. Surveys and operational evaluations, which measure how well AI explanations help users make decisions, were far less common—appearing in only 19% and 4% of studies, respectively. This imbalance suggests that while researchers may verify that AI models meet technical requirements, they are not consistently assessing whether these systems are actually interpretable or useful to end users.

The findings come amid increasing regulatory scrutiny of AI transparency. Governments worldwide, including the U.S. and the European Union, are pushing for clearer accountability measures to prevent AI systems from making opaque or misleading decisions. The report warns that without robust evaluation standards, AI developers may prioritize compliance with loosely defined explainability requirements rather than meaningful transparency.

Policymakers are urged to invest in AI safety standards and build a workforce capable of assessing evaluation methods effectively. The report recommends that regulatory agencies establish clearer guidelines on how AI explainability should be measured, ensuring that evaluations provide actionable insights rather than superficial compliance metrics.

“Our findings suggest that explainability policies may not be effective unless they include precise evaluation criteria,” the CSET AI report concludes. “Policymakers should recognize the multidimensional nature of explainability and work toward establishing a structured approach for assessing AI transparency.”

With AI systems playing an increasingly influential role in decision-making across industries, the report stresses the importance of ensuring that explainability measures serve their intended purpose—enhancing user understanding, accountability, and trust in AI-driven processes.

Need Help?

If you’re concerned or have questions about how to navigate the AI regulatory landscape, don’t hesitate to reach out to BABL AI. Their Audit Experts can offer valuable insight and ensure you’re informed and compliant.
