Abstract: To address the unstable convergence and poor optimization performance of deep reinforcement learning for multi-objective task scheduling in edge computing environments, a new multi-objective task scheduling algorithm based on an improved dueling double deep Q-network (IMTS-D3QN) is proposed. First, the double deep Q-network decouples action selection from target Q-value evaluation to eliminate overestimation, and experience samples are drawn from the experience replay unit using an immediate-reward-based classification method, which improves sample utilization and accelerates neural network training. Then, the neural network is optimized by introducing a dueling network structure. Finally, a soft-update method improves the stability of the algorithm, and a dynamic ε-greedy strategy with exponential decay is used to find the optimal policy. Pareto optimal solutions are obtained through different linear weighting combinations to minimize response time and energy consumption. Experimental results show that, compared with other algorithms, IMTS-D3QN yields clear improvements in response time and energy consumption across different numbers of tasks.
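The core update rules named in the abstract (double-Q target decoupling, dueling aggregation, soft target update, exponentially decaying ε-greedy, and linear scalarization of the two objectives) can be sketched as follows. This is a minimal illustrative sketch: all function names and hyperparameter values are assumptions, not taken from the paper.

```python
import numpy as np

def dueling_q(value, advantages):
    """Dueling aggregation: Q(s,a) = V(s) + A(s,a) - mean_a A(s,a)."""
    return value + advantages - advantages.mean(axis=-1, keepdims=True)

def double_q_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double DQN target: the online network selects the action,
    the target network evaluates it, which reduces overestimation."""
    best_action = int(np.argmax(next_q_online))
    return reward + (0.0 if done else gamma * next_q_target[best_action])

def soft_update(target_params, online_params, tau=0.01):
    """Soft update: theta_target <- tau * theta_online + (1 - tau) * theta_target."""
    return [(1 - tau) * t + tau * o for t, o in zip(target_params, online_params)]

def epsilon(step, eps_start=1.0, eps_end=0.05, decay=1e-3):
    """Dynamic ε-greedy with exponential decrease over training steps."""
    return eps_end + (eps_start - eps_end) * np.exp(-decay * step)

def scalarized_reward(response_time, energy, w):
    """Linear weighting of the two objectives; sweeping w in [0, 1]
    over separate runs traces out Pareto-optimal trade-offs."""
    return -(w * response_time + (1.0 - w) * energy)
```

Sweeping the weight `w` and recording the resulting (response time, energy) pairs yields the set of non-dominated solutions from which the Pareto front described in the abstract is assembled.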