Abstract: The posture of the double hemisphere capsule robot (DHCR) tends to deviate from the target orientation because of nonlinear factors such as the viscoelastic damping of the gastrointestinal (GI) tract and deviation of the DHCR centroid, and because of posture estimation errors caused by large visual disparity and motion blur. To address this problem, a rapid posture correction method based on self-supervised learning is proposed. For calibration of the capsule's initial posture, the influence of the initial rotation angle is eliminated. For estimation of the capsule's attitude, a spatial-temporal attention mechanism (TSAM) is designed from a spatial attention block (SAB) and a temporal attention module (TAM), with a component of the standard convolution replaced by depthwise separable convolution; TSAM is embedded into PoseNet to form an attention posture estimation network (APEN), which enhances the model's feature extraction ability. Experimental results show that, compared with the current capsule posture estimation method Endo-SfM, APEN improves relative posture estimation accuracy by 52% while keeping inference speed almost unchanged. Moreover, the method improves posture control accuracy by 38.8% and corrects the capsule posture in real time, laying the foundation for successful dynamic GI tract diagnosis and therapy.
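
The abstract states that adding attention leaves inference speed almost unchanged, which is consistent with the general cost advantage of depthwise separable convolution over standard convolution. The sketch below only illustrates that general parameter-count relationship. The kernel size and channel counts are hypothetical values chosen for illustration, not taken from APEN, and bias terms are ignored.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, assumed for illustration only.
k, c_in, c_out = 3, 64, 128

std = standard_conv_params(k, c_in, c_out)
sep = depthwise_separable_params(k, c_in, c_out)
ratio = sep / std  # equals 1 / c_out + 1 / k**2

print(f"standard: {std}, depthwise separable: {sep}, ratio: {ratio:.3f}")
```

For these assumed sizes the separable layer needs roughly 12% of the weights of the standard layer. The actual saving in APEN depends on which component of the convolution the authors replaced and on the layer dimensions they used.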