Abstract: Current multi-agent reinforcement learning algorithms based on value decomposition do not fully account for the cooperative relationships among agents, and the stochastic strategy used during exploration tends to overshoot the optimum and become trapped in a local optimum. To address these problems, this paper proposes a deep communication multi-agent reinforcement learning algorithm. First, a communication mechanism built from convolutional and fully connected layers is designed within the value decomposition network to strengthen cooperation among agents. Second, a new adaptive exploration strategy is proposed, to which a periodic decay strategy is added to balance exploration and exploitation. Finally, simulation results show that the proposed method achieves a 25.8% performance improvement in some scenarios and improves the cooperative capability of the agents.
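
The abstract does not specify the form of the periodic decay, so the following is only an illustrative sketch of one way such a schedule could look, not the method proposed in this paper. It assumes an epsilon-greedy policy whose exploration rate decays linearly within each period and restarts at a lower peak in the next period. The function names and the parameters `eps_max`, `eps_min`, `period`, and `shrink` are all hypothetical.

```python
import random


def periodic_decay_epsilon(step, eps_max=1.0, eps_min=0.05, period=1000, shrink=0.5):
    """Hypothetical exploration schedule (an assumption, not the paper's formula).

    Within each period, epsilon decays linearly from the period's peak
    towards eps_min; the peak itself shrinks by `shrink` after every period.
    """
    cycle, pos = divmod(step, period)
    # Peak exploration rate for this period, never below eps_min.
    peak = eps_min + (eps_max - eps_min) * (shrink ** cycle)
    # Linear decay from the peak towards eps_min over the period.
    return peak - (peak - eps_min) * (pos / period)


def select_action(q_values, epsilon, rng):
    """Epsilon-greedy choice over one agent's Q-values."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore: uniform random action
    # Exploit: index of the largest Q-value.
    return max(range(len(q_values)), key=lambda a: q_values[a])


if __name__ == "__main__":
    rng = random.Random(0)
    for step in (0, 500, 1000, 1500, 5000):
        eps = periodic_decay_epsilon(step)
        action = select_action([0.1, 0.7, 0.2], eps, rng)
        print(step, round(eps, 4), action)
```

Restarting exploration at the beginning of each period gives the agents repeated chances to leave a local optimum, while the shrinking peak shifts the balance towards exploitation as training proceeds.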