Abstract: To address the problems of inaccurate environment interaction information, sparse feedback, and unstable convergence that deep reinforcement learning algorithms face in path planning, a dueling deep Q-network algorithm based on an adaptive ε-greedy strategy and reward design is proposed. When exploring the environment, the agent uses an ε-greedy strategy with a self-adjusting greedy factor, in which the exploration rate ε is determined by the degree of convergence of the learning algorithm, so that the probabilities of exploration and exploitation are allocated reasonably. Drawing on the physical model of the artificial potential field method, a potential-field reward function is designed that applies a strong attractive potential reward around the target and a repulsive potential reward near obstacles, driving the agent to reach the goal faster. Simulation experiments in a 2D grid environment show that the algorithm achieves a higher average reward and more stable convergence on maps of different scales, with a 48.04% improvement in path-planning success rate, verifying the effectiveness and robustness of the algorithm for path planning. Compared with Q-learning, the proposed method improves the path-planning success rate by 28.14% and exhibits better environment exploration and path-planning capabilities.
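
To make the adaptive exploration concrete, the sketch below shows one way a greedy factor could self-adjust with the algorithm's convergence degree. The abstract does not specify the exact rule, so measuring convergence by the relative fluctuation of recent episode rewards, and the bounds eps_min and eps_max, are illustrative assumptions rather than the paper's definition.

```python
import numpy as np

def adaptive_epsilon(recent_rewards, eps_min=0.05, eps_max=1.0):
    """Hypothetical adaptive greedy factor: epsilon shrinks as learning converges.

    The convergence degree is approximated here by the relative fluctuation
    of recent episode rewards (an assumption, not the paper's exact measure):
    large fluctuation -> far from convergence -> explore more.
    """
    if len(recent_rewards) < 2:
        return eps_max  # no evidence of convergence yet: explore fully
    # Relative fluctuation, clipped to [0, 1]; small when rewards have stabilized.
    fluctuation = np.std(recent_rewards) / (abs(np.mean(recent_rewards)) + 1e-8)
    degree = float(np.clip(fluctuation, 0.0, 1.0))
    return eps_min + (eps_max - eps_min) * degree

def select_action(q_values, epsilon, rng=np.random.default_rng()):
    """Standard epsilon-greedy action selection over a vector of Q-values."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))  # explore: random action
    return int(np.argmax(q_values))              # exploit: greedy action
```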
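
Likewise, a minimal sketch of the potential-field reward, assuming the classical artificial-potential-field terms (a quadratic attractive potential toward the goal and a repulsive potential inside an influence radius of each obstacle) and using the step-wise decrease in total potential as the shaping reward; the coefficients k_att and k_rep and the radius d0 are hypothetical values, not taken from the paper.

```python
import numpy as np

def potential(pos, goal, obstacles, k_att=1.0, k_rep=5.0, d0=2.0):
    """Artificial-potential-field value at a grid position (illustrative form).

    The attractive term grows with squared distance to the goal; the repulsive
    term is active only within influence radius d0 of an obstacle, as in the
    classical APF formulation. All coefficients are assumed values.
    """
    pos, goal = np.asarray(pos, float), np.asarray(goal, float)
    u_att = 0.5 * k_att * np.sum((pos - goal) ** 2)
    u_rep = 0.0
    for obs in obstacles:
        d = np.linalg.norm(pos - np.asarray(obs, float))
        if 0 < d <= d0:
            u_rep += 0.5 * k_rep * (1.0 / d - 1.0 / d0) ** 2
    return u_att + u_rep

def potential_reward(prev_pos, new_pos, goal, obstacles):
    """Shaping reward: positive when a move lowers the total potential,
    i.e. the agent is drawn toward the goal and pushed away from obstacles."""
    return potential(prev_pos, goal, obstacles) - potential(new_pos, goal, obstacles)
```

Defining the reward as the potential difference between consecutive positions, rather than the raw potential, keeps the dense feedback aligned with progress toward the goal, which matches the abstract's aim of mitigating sparse feedback.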