Abstract: Automatic detection of uncertain objects is a key technology for unmanned intelligent lifting and handling, and deep-learning-based instance segmentation models are the most effective approach in recent use. Existing instance segmentation backbone networks have three limitations in this setting: their ability to extract global context features from lifting-scene images is limited; the convolution operators in CNN-based backbones have difficulty modeling long-range dependencies because of their limited receptive field; and sufficient depth cues are lacking when identifying single targets with texture features. To address these limitations, a module is designed that fuses the heterogeneous features of a CNN and a Transformer: the Transformer models global dependencies, and this is combined with the CNN's ability to extract local information. The Dense RepPoints detection network is then introduced to build an instance segmentation network for complex lifting and loading scenes, which accurately segments the objects being loaded and unloaded as well as the different surfaces of those objects. Compared with the current state-of-the-art method, AP increases by 4.95% to 98.82% and mIoU increases by 5.42% to 91.89%. This segmentation quality addresses a key technical problem of intelligent lifting, loading, and unloading, improving the efficiency and safety of unmanned lifting logistics and reducing costs.
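For readers unfamiliar with the mIoU figure reported above, the following is a minimal sketch of how a mean intersection-over-union can be computed over pairs of predicted and ground-truth binary masks. It is an illustration only: the abstract does not state whether the average is taken over instances, classes, or images, so averaging over mask pairs is an assumption, and the function names `mask_iou` and `mean_iou` are hypothetical rather than taken from the paper.

```python
def mask_iou(pred, target):
    """IoU of two binary masks given as equally sized nested lists of 0/1."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                inter += 1  # pixel is set in both masks
            if p or t:
                union += 1  # pixel is set in at least one mask
    # Assumption: two empty masks are treated as a perfect match.
    return inter / union if union else 1.0


def mean_iou(pairs):
    """Average IoU over (prediction, ground truth) mask pairs (assumed averaging scheme)."""
    ious = [mask_iou(pred, target) for pred, target in pairs]
    return sum(ious) / len(ious)


# Toy 2x2 masks: the first pair overlaps on one of two set pixels, the second is identical.
pairs = [
    ([[1, 1], [0, 0]], [[1, 0], [0, 0]]),
    ([[0, 1], [0, 1]], [[0, 1], [0, 1]]),
]
print(mean_iou(pairs))
```

Under this scheme, a score such as 91.89% would mean that, on average, the predicted and ground-truth masks share about 92% of the pixels in their union.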