Abstract: To prevent traffic accidents caused by fatigue driving and to safeguard urban road traffic and occupant safety, this work addresses the core problems of traditional fatigue driving detection methods, namely low accuracy, elaborate parameter tuning, and poor generalization. MTCNN face detection and infrared-based remote photoplethysmography (rPPG) are used to accurately extract the driver's facial and physiological information in complex driving environments with changing illumination, partial occlusion, and head deflection. After deep mining of the modality-specific fatigue cues, a multi-loss reconstruction (MLR) feature fusion module exploits the complementary information between modalities to construct a multimodal feature integration model, which overcomes the limitations of single-modality detection and improves accuracy and robustness in complicated driving environments. Finally, exploiting the time-series nature of fatigue, a fatigue driving detection system based on a Bi-LSTM model is established. Experiments on the home-made FAHD dataset demonstrate the reliability of the infrared physiological feature extraction model. Moreover, multimodal input improves accuracy by at least 5.6% over single-modal input; compared with existing fusion methods, the correlation coefficient improves by 5.6% and the root mean square error is reduced by 25%, with an overall accuracy of 96.7%. Besides promoting the development of intelligent transportation, the proposed system has positive significance for maintaining traffic safety.
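As a rough illustration of the Bi-LSTM temporal stage named in the abstract, the minimal NumPy sketch below runs a forward and a backward LSTM pass over a sequence of fused per-frame feature vectors and concatenates the two final hidden states. All dimensions, parameter shapes, and the random initialization are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_forward(x, W, U, b, hidden):
    """Run a single-direction LSTM over x of shape (T, d); return final hidden state."""
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for t in range(x.shape[0]):
        z = W @ x[t] + U @ h + b            # stacked gate pre-activations, (4*hidden,)
        i = sigmoid(z[:hidden])             # input gate
        f = sigmoid(z[hidden:2 * hidden])   # forget gate
        g = np.tanh(z[2 * hidden:3 * hidden])  # candidate cell state
        o = sigmoid(z[3 * hidden:])         # output gate
        c = f * c + i * g
        h = o * np.tanh(c)
    return h

def bilstm_features(x, params_fwd, params_bwd, hidden):
    """Bi-LSTM: forward pass plus a pass over the time-reversed sequence."""
    h_f = lstm_forward(x, *params_fwd, hidden)
    h_b = lstm_forward(x[::-1], *params_bwd, hidden)
    return np.concatenate([h_f, h_b])       # (2*hidden,) sequence descriptor

# Demo with random weights: 30 frames of hypothetical 16-d fused features.
rng = np.random.default_rng(0)
d, hidden = 16, 8
make_params = lambda: (rng.normal(scale=0.1, size=(4 * hidden, d)),
                       rng.normal(scale=0.1, size=(4 * hidden, hidden)),
                       np.zeros(4 * hidden))
x = rng.normal(size=(30, d))
feats = bilstm_features(x, make_params(), make_params(), hidden)
print(feats.shape)  # (16,)
```

In a full system, `feats` would feed a trained classification head producing the fatigue/non-fatigue decision; here the weights are random, so only the shapes are meaningful.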