Abstract: To improve data retrieval and mining in large-scale information management systems, a data mining method based on semantic association feature extraction is proposed. A cloud storage model is constructed to provide distributed storage of big data in the information management system, and the system structure is reorganized by applying feature recombination to the big data information flow. Semantic association dimension features of the distributed data are extracted from the reorganized system topology and used as the training sample set for integrated scheduling and data mining. The fuzzy C-means algorithm is used to adaptively fuse and cluster the semantic association features of the distributed data, and a feature compressor reduces the dimensionality of the storage space. Together, these steps improve target data mining and adaptive scheduling in the information management system. Simulation results show that the method mines data accurately and clusters semantic associations effectively, improving the retrieval and scheduling of target data in the information management system.
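
The fuzzy C-means clustering step named in the abstract can be illustrated with a minimal sketch. This is the textbook form of the algorithm, not necessarily the adaptive variant used in the paper. The function name `fuzzy_c_means`, its parameters, the fuzzifier value m = 2, the Euclidean distance, and the toy 2-D points standing in for semantic association feature vectors are all assumptions made for illustration.

```python
import math
import random


def fuzzy_c_means(points, n_clusters, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Cluster `points` (a list of equal-length float sequences) with fuzzy C-means.

    Returns (centers, memberships), where memberships[i][j] is the degree
    to which point i belongs to cluster j; each row sums to 1.
    """
    rng = random.Random(seed)
    dim = len(points[0])

    # Random initial memberships, normalised so each row sums to 1.
    u = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    exponent = 2.0 / (m - 1.0)
    for _ in range(max_iter):
        # Center update: mean of the points weighted by u_ij ** m.
        centers = []
        for j in range(n_clusters):
            weights = [u[i][j] ** m for i in range(len(points))]
            total = sum(weights)
            centers.append([
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            ])

        # Membership update from the distances to the new centers.
        new_u = []
        for p in points:
            dists = [math.dist(p, c) for c in centers]
            if any(d == 0.0 for d in dists):
                # Point coincides with a center: assign it fully there.
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                n_hits = sum(hits)
                new_u.append([h / n_hits for h in hits])
            else:
                new_u.append([
                    1.0 / sum((dj / dk) ** exponent for dk in dists)
                    for dj in dists
                ])

        # Stop once no membership changed by more than `tol`.
        shift = max(abs(a - b) for ra, rb in zip(new_u, u) for a, b in zip(ra, rb))
        u = new_u
        if shift < tol:
            break

    return centers, u


# Hypothetical feature vectors forming two well-separated groups.
points = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
centers, memberships = fuzzy_c_means(points, n_clusters=2)
```

Unlike hard clustering, each record receives a graded membership in every cluster, which is what allows features from overlapping semantic groups to be fused rather than forced into a single class.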