Optimization algorithm based on value decomposition of multi-agent reinforcement learning
Affiliation:

School of Automation, Nanjing University of Information Science & Technology, Nanjing 210044, China

CLC Number:

TP242

    Abstract:

    Existing multi-agent reinforcement learning algorithms based on value decomposition do not fully account for the cooperative relationships among agents, and the stochastic policy used during exploration tends to overshoot the optimum and become trapped in local optima. To address these problems, this paper proposes a deep-communication multi-agent reinforcement learning algorithm. A communication mechanism built from convolutional and fully connected layers is introduced into the value decomposition network to strengthen cooperation among agents. A new adaptive exploration strategy is then proposed, with a periodic decay strategy added to balance exploration and exploitation. Simulation results show that the proposed method improves performance by 25.8% in some scenarios and improves the cooperation capability of the agents.
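
The abstract does not give the form of the periodic decay strategy, so the following is only a minimal sketch of one way such a schedule could look when paired with epsilon-greedy action selection. The function names, the linear decay within each period, the geometric shrinking of the peak, and all default values are assumptions made for illustration, not the paper's method.

```python
import random


def periodic_decay_epsilon(step, period=1000, eps_max=1.0, eps_min=0.05, shrink=0.5):
    """Hypothetical exploration schedule (not taken from the paper).

    Epsilon decays linearly toward eps_min within each period, then restarts
    at a peak that shrinks geometrically from one period to the next.
    """
    cycle, pos = divmod(step, period)
    # Peak exploration rate for this cycle; approaches eps_min as cycles pass.
    peak = eps_min + (eps_max - eps_min) * shrink ** cycle
    # Linear decay from the peak toward eps_min over the course of the period.
    return peak - (peak - eps_min) * (pos / period)


def select_action(q_values, step, rng=random):
    """Epsilon-greedy choice over one agent's Q-values using the schedule above."""
    if rng.random() < periodic_decay_epsilon(step):
        return rng.randrange(len(q_values))  # explore: uniform random action
    # Exploit: index of the largest Q-value.
    return max(range(len(q_values)), key=q_values.__getitem__)
```

Restarting the exploration rate at the start of each period gives the agents repeated chances to leave a local optimum, while the shrinking peak shifts the balance toward exploitation as training proceeds.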

History
  • Online: February 18, 2024