Confidence Scoring for AI-Predicted Antibody–Antigen Complexes: AntiConf as a Precision-driven Metric
Abstract
Accurate determination of antibody-antigen (Ab-Ag) complex structures is critical for therapeutic development. While AI-based methods, beginning with AlphaFold2 (AF2), have revolutionized multimer prediction, the optimal strategies for Ab-Ag modeling and the reliability of the associated confidence scores remain active areas of research. This study evaluates the performance of AF2, Boltz-1, Boltz-1x, Boltz-2, Chai-1, Protenix, and ESMFold on a curated dataset of 200 Ab-Ag complexes. Our findings reveal that AF2 remains a strong predictor of Ab-Ag complexes relative to newer AlphaFold3 (AF3)-inspired methods, while Chai-1 consistently ranks as the second-best performer across multiple success metrics. The effect of recycling iterations varied by method: AF2, Chai-1, and Protenix benefited from additional cycles, unlike the Boltz variants. We analyzed several model confidence scores and found that pDockQ2 offers high precision and pTM offers high recall. By integrating these two scores, we developed AntiConf, a novel metric that achieves superior precision and recall for all methods. These strengths make AntiConf a valuable post-hoc score for both computational predictions and downstream experimental workflows, with the potential to improve Ab-Ag complex predictions from AF2 and AF3 architectures. Altogether, this study addresses current limitations in AI-based Ab-Ag complex prediction, showcases the potential of AntiConf for predicting and assessing these complexes, and provides a guideline for improving prediction accuracy.