Abstract: Existing ground-based cloud classification methods cannot make full use of multi-modal information. To address this, we propose the Dense Fusion Convolutional Neural Network (DFCNN) for multi-modal ground-based cloud classification, which effectively integrates the visual features and the multi-modal features of ground-based cloud samples. The DFCNN uses a convolutional neural network as the visual subnet to extract visual features and a multi-modal subnet to extract multi-modal features from cloud samples. Five Dense Fusion Modules (DFMs) in the DFCNN fully fuse the visual and multi-modal features. A DFM can be inserted into a subnet independently, without changing the original network structure, which makes it highly flexible. The DFCNN achieves a classification accuracy of 89.14% on the public multi-modal ground-based cloud dataset MGCD, which verifies its effectiveness for the ground-based cloud classification task.
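
As a rough illustration of the two-subnet layout described in the abstract, the following Python sketch pairs a visual branch with a multi-modal branch and fuses them after each of five stages. It is not the authors' implementation. The abstract does not specify the internal design of the DFM, so the fusion rule used here (a linear projection followed by element-wise addition) is an assumption, small dense layers stand in for the convolutional visual subnet, and all names (`fusion_module`, `build_model`, `forward`) and sizes (16 visual inputs, 4 multi-modal inputs, 7 classes) are hypothetical placeholders.

```python
import random


def linear(x, weights, bias):
    """Dense layer on plain lists: returns weights @ x + bias."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in x]


def init_linear(n_in, n_out, rng):
    """Small random weights and zero bias for a dense layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def fusion_module(visual, multimodal, params):
    """Hypothetical fusion step (assumption, not the paper's DFM):
    project the multi-modal features, then add them to the visual ones."""
    weights, bias = params
    projected = linear(multimodal, weights, bias)
    return [v + p for v, p in zip(visual, projected)]


def build_model(visual_dim, mm_dim, hidden, num_classes, num_stages=5, seed=0):
    """Create parameters for both subnets, one fusion module per stage,
    and a final classifier. num_stages=5 mirrors the five DFMs."""
    rng = random.Random(seed)
    visual_layers, mm_layers, fusion_params = [], [], []
    v_in, m_in = visual_dim, mm_dim
    for _ in range(num_stages):
        visual_layers.append(init_linear(v_in, hidden, rng))
        mm_layers.append(init_linear(m_in, hidden, rng))
        fusion_params.append(init_linear(hidden, hidden, rng))
        v_in, m_in = hidden, hidden
    classifier = init_linear(hidden, num_classes, rng)
    return visual_layers, mm_layers, fusion_params, classifier


def forward(visual_in, multimodal_in, visual_layers, mm_layers,
            fusion_params, classifier):
    """Run both branches stage by stage, fusing after every stage."""
    v, m = visual_in, multimodal_in
    for (vw, vb), (mw, mb), fp in zip(visual_layers, mm_layers, fusion_params):
        v = relu(linear(v, vw, vb))   # stand-in for a convolutional stage
        m = relu(linear(m, mw, mb))   # multi-modal subnet stage
        v = fusion_module(v, m, fp)   # fused features feed the next stage
    cw, cb = classifier
    return linear(v, cw, cb)          # unnormalized class scores
```

For example, `forward([0.5] * 16, [0.1, 0.2, 0.3, 0.4], *build_model(16, 4, 8, 7))` returns a list of 7 untrained class scores. Because the fusion step only adds to the visual branch's output, it can be dropped in or removed without altering either subnet's layers, which is one way to read the flexibility claim in the abstract.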