Causal Fairness in Black-Box AI: A Counterfactual Auditing Framework for Deep Models


Abstract

As artificial intelligence (AI) systems become increasingly embedded in critical domains such as finance, criminal justice, employment, and healthcare, questions of fairness and accountability are no longer theoretical; they are urgent. Many of the most influential models in these domains operate as black boxes, producing decisions that are difficult to interpret and even harder to audit. Traditional fairness metrics, such as Demographic Parity and Equalized Odds, assess disparities across groups but often miss subtler, individual-level biases and fail to account for the causal pathways that link protected attributes to decisions.

This paper introduces a model-agnostic framework for evaluating fairness in black-box AI models using counterfactual reasoning. We propose the Counterfactual Fairness Gap (CFG), a novel metric that quantifies how frequently an individual's predicted outcome would change if their protected attribute (e.g., race or gender) were counterfactually altered, while maintaining causal consistency through a structural causal model (SCM).

Our framework does not require access to the model's internal architecture or training data, making it broadly applicable in real-world settings where models are proprietary or opaque. We apply the method to two widely studied datasets, COMPAS and UCI Adult Income, using three commonly deployed classifiers: deep neural networks, XGBoost, and random forests. Empirical results show that CFG identifies significant fairness violations that remain undetected by traditional statistical metrics.

Beyond its technical utility, the framework offers practical and regulatory benefits. It supports both pre-deployment and post-deployment auditing and aligns with global AI governance initiatives such as the EU AI Act. By combining causal rigor with operational flexibility, our approach provides a powerful tool for identifying and addressing fairness risks in modern AI systems.
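The abstract describes the CFG only informally, so the following is a minimal sketch of how such a metric could be computed. The paper does not specify an implementation; the function name `counterfactual_fairness_gap`, the two-variable linear SCM, and `biased_model` are all hypothetical illustrations. The key idea shown is that counterfactual inputs are generated by re-evaluating the structural equations with the same exogenous noise (the abduction step of counterfactual inference), and the metric is the fraction of individuals whose black-box prediction flips.

```python
import numpy as np

def counterfactual_fairness_gap(model_predict, X, X_cf):
    """Illustrative CFG: fraction of individuals whose predicted label
    flips when their protected attribute is counterfactually altered.

    model_predict : callable returning hard labels (black-box access only).
    X             : original feature matrix, shape (n, d).
    X_cf          : counterfactual features produced by intervening on the
                    protected attribute in an SCM.
    """
    y = np.asarray(model_predict(X))
    y_cf = np.asarray(model_predict(X_cf))
    return float(np.mean(y != y_cf))

# Toy SCM: the protected attribute A causally influences a downstream
# feature ("score"), so the intervention do(A := 1 - A) must propagate
# through the structural equation rather than just flipping the A column.
rng = np.random.default_rng(0)
n = 1_000
A = rng.integers(0, 2, size=n)      # protected attribute (binary)
U = rng.normal(size=n)              # exogenous noise
score = 2.0 * A + U                 # structural equation for the descendant
X = np.column_stack([A, score]).astype(float)

A_cf = 1 - A                        # counterfactual attribute
score_cf = 2.0 * A_cf + U           # reuse the same noise U (abduction)
X_cf = np.column_stack([A_cf, score_cf]).astype(float)

# A deliberately biased black box: it thresholds the descendant feature,
# which transmits the influence of A even though A itself is not "used".
biased_model = lambda M: (M[:, 1] > 1.0).astype(int)

print(f"CFG = {counterfactual_fairness_gap(biased_model, X, X_cf):.3f}")
```

In this toy setting the gap is large (roughly 0.68) even though the classifier never reads the protected attribute directly, illustrating the abstract's point that causal pathways can carry bias that group-level statistical metrics miss.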
