Abstract: To address the memory bandwidth bottleneck in FPGA hardware accelerators for convolutional neural networks, this paper proposes Secondary Cache-Row Recombination (SC-RR), a memory access scheduling strategy built on a secondary cache. The secondary cache mechanism is derived from an analysis of SDRAM performance, the principle of FPGA hardware acceleration, and the memory bandwidth bottleneck. It buffers the access requests that accumulate during acceleration and merges requests that target the same Bank/Row, which reduces the extra overhead of Active and Precharge operations. Experimental results show that, under the SC-RR scheduling strategy, memory access time is reduced by 32.87%, power consumption is reduced by 31.71%, and effective bandwidth utilization rises to 91.3%. At comparable performance, hardware resource consumption is reduced by 83.8%, which meets the design requirements.
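The idea of merging requests from the same Bank/Row can be illustrated with a minimal software sketch. It is not the paper's hardware design: the request format `(bank, row, col)`, the function names `count_row_activations` and `recombine_by_row`, and the open-row cost model (one Active, preceded by a Precharge when another row is open, each time a bank switches rows) are assumptions made for illustration only. The sketch also ignores read/write ordering constraints that a real scheduler would have to respect.

```python
from collections import OrderedDict

def count_row_activations(requests):
    """Count Active commands under an assumed open-row policy.

    A bank keeps its last row open; accessing a different row in that
    bank costs one new Active (and a Precharge if a row was open).
    """
    open_row = {}          # bank -> currently open row
    activations = 0
    for bank, row, _col in requests:
        if open_row.get(bank) != row:
            activations += 1
            open_row[bank] = row
    return activations

def recombine_by_row(requests):
    """Group buffered requests by (bank, row).

    Groups are issued in order of first arrival, and requests inside a
    group keep their original relative order.
    """
    groups = OrderedDict()
    for req in requests:
        bank, row, _col = req
        groups.setdefault((bank, row), []).append(req)
    return [req for group in groups.values() for req in group]

# Hypothetical backlog of requests held in the secondary cache.
pending = [(0, 5, 0), (0, 9, 0), (0, 5, 1), (1, 2, 0),
           (0, 9, 1), (1, 2, 1), (0, 5, 2)]
merged = recombine_by_row(pending)

print(count_row_activations(pending))  # arrival order
print(count_row_activations(merged))   # after row recombination
```

In this toy backlog, arrival order switches rows repeatedly within bank 0, while the recombined order opens each Bank/Row pair only once. The percentages reported in the abstract come from the authors' experiments and are not reproduced by this model.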