AI RESEARCH

Data-Rate-Aware High-Speed CNN Inference on FPGAs

arXiv CS.LG

arXiv:2603.08726v1 Announce Type: cross Dataflow-based CNN accelerators on FPGAs achieve low latency and high throughput by mapping the computations of each layer directly to corresponding hardware units. However, layers such as pooling and strided convolutions produce less data at their output than they receive at their input, which strongly affects the data rate of the following layers. This leads to underutilization in fully unrolled designs. While prior work [...] This paper presents a data-rate-aware CNN accelerator architecture for multi-pixel processing.
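The data-rate effect described in the abstract can be illustrated with a small back-of-the-envelope sketch. It is not the paper's method. It assumes that a layer with stride s in both spatial dimensions emits one output pixel for every s*s input pixels (ignoring padding and border effects), and that a fully unrolled hardware unit is sized to accept one pixel per clock cycle, so its utilization equals its relative input rate. The function name `input_rates` and the layer stack are hypothetical.

```python
from fractions import Fraction


def input_rates(strides):
    """Return each layer's input pixel rate relative to the first layer's input.

    Assumption: a stride-s layer emits one output pixel per s*s input pixels.
    """
    rates = []
    rate = Fraction(1)
    for s in strides:
        rates.append(rate)
        rate /= s * s  # the next layer sees this layer's reduced output rate
    return rates


# Hypothetical stack: conv, 2x2 pooling, conv, strided conv, conv
strides = [1, 2, 1, 2, 1]
rates = input_rates(strides)

for i, (s, r) in enumerate(zip(strides, rates)):
    # Under the one-pixel-per-clock assumption, utilization equals the input rate.
    print(f"layer {i}: stride {s}, input rate {r}, utilization {float(r):.1%}")
```

Under these assumptions the last layer receives a pixel only once every 16 cycles, which is the kind of idle time in a fully unrolled design that the abstract refers to as underutilization.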