Abstract: As mobile robots are increasingly applied in production and daily life, their path planning must achieve both rapidity and environmental adaptability. Existing reinforcement-learning-based path planning methods for mobile robots tend to fall into local optima in the early stage of exploration, repeatedly search the same area, and converge slowly in the later stage. To address these problems, this study proposes an improved Q-Learning algorithm. The algorithm improves the Q-matrix initialization so that early exploration is directional, reducing collisions; it improves the Q-matrix update rule so that updates are forward-looking, avoiding repeated exploration within a small area; and it improves the random exploration strategy so that environmental information is fully exploited in early iterations while the agent moves toward the target point in later iterations. Simulation results on different grid maps show that, compared with the standard Q-Learning algorithm, the proposed algorithm achieves higher computational efficiency by shortening the path length, reducing jitter, and accelerating convergence.
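The improvements described above modify the initialization, update rule, and exploration strategy of standard tabular Q-Learning. The abstract does not give their formulas, so the following is only a minimal sketch of the baseline Q-Learning loop on a grid map that those improvements build on; the reward values, hyperparameters, and function names are illustrative assumptions, not the paper's.

```python
import numpy as np

def q_learning(grid, start, goal, episodes=500, alpha=0.1, gamma=0.9, eps=0.2, seed=0):
    """Baseline tabular Q-Learning on a 2D grid; 0 = free cell, 1 = obstacle."""
    rng = np.random.default_rng(seed)
    rows, cols = grid.shape
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]    # up, down, left, right
    Q = np.zeros((rows, cols, len(moves)))
    for _ in range(episodes):
        r, c = start
        for _ in range(4 * rows * cols):          # step cap per episode
            if rng.random() < eps:                # epsilon-greedy exploration
                a = int(rng.integers(len(moves)))
            else:
                a = int(np.argmax(Q[r, c]))
            nr, nc = r + moves[a][0], c + moves[a][1]
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr, nc] == 1:
                reward, (nr, nc) = -10.0, (r, c)  # collision: penalty, stay put
            elif (nr, nc) == goal:
                reward = 100.0                    # reached the target point
            else:
                reward = -1.0                     # step cost favors short paths
            # standard Q-Learning update rule
            Q[r, c, a] += alpha * (reward + gamma * Q[nr, nc].max() - Q[r, c, a])
            r, c = nr, nc
            if (r, c) == goal:
                break
    return Q

def greedy_path(Q, start, goal, max_steps=200):
    """Extract a path by following argmax actions in the learned Q table."""
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    path, (r, c) = [start], start
    for _ in range(max_steps):
        if (r, c) == goal:
            break
        dr, dc = moves[int(np.argmax(Q[r, c]))]
        r, c = r + dr, c + dc
        path.append((r, c))
    return path
```

In this baseline, the Q matrix starts at zero and the agent explores uniformly at random with probability `eps`, which is exactly what causes the undirected early search and slow late convergence the paper targets with its modified initialization, forward-looking update, and adaptive exploration strategy.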