Adaptive Confidence-Weighted Policy Aggregation: A Novel Method for Federated Reinforcement Learning


Abstract

This paper proposes a Federated Reinforcement Learning (FRL) approach called Adaptive Confidence-Weighted Policy Aggregation (ACWPA). ACWPA is designed for multi-agent tasks with incomplete information and heterogeneous knowledge, combining the strengths of multiple agents while compensating for their individual weaknesses. The method dynamically weights each agent's contribution to the global policy according to the agent's performance and the relevance of its expertise, even when information about state-action rewards is only partially available. Evaluated on a multi-agent path-planning task, ACWPA shows faster convergence and better generalization than standard FRL baselines such as FedAvg and FedProx. Results indicate that ACWPA improves navigation efficiency by 20% and reduces collision rates by 35% across diverse environments, highlighting its capacity to enhance collaborative learning in multi-agent systems with heterogeneous knowledge. Furthermore, applying ACWPA to large language models (LLMs) yielded a 15% improvement, suggesting that the method may be applicable to other areas of artificial intelligence.
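The abstract describes weighting each agent's policy contribution by its performance and expertise relevance. The paper's exact scoring rule is not given here, so the following is only a minimal sketch under an illustrative assumption: confidence scores are the product of recent average reward and a task-relevance factor, normalized via a softmax, and the global policy parameters are the resulting convex combination. The function name and signature are hypothetical, not from the paper.

```python
import numpy as np

def aggregate_policies(policy_params, rewards, relevance, temperature=1.0):
    """Confidence-weighted aggregation of agent policy parameters.

    Assumption (not from the paper): each agent's confidence score is
    recent average reward * task-relevance, turned into aggregation
    weights with a temperature-controlled softmax.
    """
    scores = np.asarray(rewards, dtype=float) * np.asarray(relevance, dtype=float)
    # Softmax over scores (shifted by the max for numerical stability).
    exp = np.exp((scores - scores.max()) / temperature)
    weights = exp / exp.sum()
    # Global policy parameters as a convex combination of agent parameters.
    stacked = np.stack(policy_params)        # shape: (n_agents, n_params)
    global_params = weights @ stacked
    return global_params, weights

# Usage: the better-performing, more relevant agent dominates the mix.
params = [np.array([0.0, 0.0]), np.array([1.0, 1.0])]
global_params, weights = aggregate_policies(
    params, rewards=[0.2, 0.8], relevance=[1.0, 1.0]
)
```

Lowering `temperature` sharpens the weighting toward the most confident agent; raising it moves the aggregation toward a plain FedAvg-style uniform average.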
