An FPGA based Hardware Accelerator with Binary Weights for Deep Neural Networks
This paper describes the implementation of a systolic-array-based hardware accelerator for multilayer perceptrons
(MLPs) on an FPGA. Full-precision hardware implementations of neural networks have high resource utilization, which makes it
difficult to fit large neural networks on an FPGA. These implementations also have high power consumption.
Neural networks are implemented with numerous multiply-and-accumulate (MAC) units, and the multipliers in these MAC
units are expensive in terms of power. Algorithms have been proposed that quantize the weights and eliminate the need for
multipliers in a neural network without compromising much on classification accuracy. These algorithms replace the MAC units
with simple accumulators, and the quantized weights also reduce the weight storage requirements.
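The replacement of MAC units by accumulators can be illustrated with a minimal software sketch. It assumes a deterministic sign-based two-level quantization (weights mapped to +1 or -1); the function names and the sample values are hypothetical and are not taken from the hardware design described here.

```python
def binarize(weights):
    """Two-level quantization (assumed sign rule): +1 if w >= 0, else -1."""
    return [1 if w >= 0 else -1 for w in weights]


def mac_neuron(inputs, weights):
    """Reference neuron pre-activation: one multiplication per input."""
    acc = 0.0
    for x, w in zip(inputs, weights):
        acc += x * w
    return acc


def accumulate_neuron(inputs, binary_weights):
    """Same sum for +/-1 weights, using only additions and subtractions."""
    acc = 0.0
    for x, b in zip(inputs, binary_weights):
        if b > 0:
            acc += x  # weight +1: add the input
        else:
            acc -= x  # weight -1: subtract the input
    return acc


# Hypothetical example values, chosen only for illustration.
inputs = [0.5, -1.0, 2.0, 0.25]
weights = [0.3, -0.7, -0.1, 0.9]

binary_weights = binarize(weights)
multiplier_free = accumulate_neuron(inputs, binary_weights)
with_multipliers = mac_neuron(inputs, binary_weights)
```

With binary weights, the multiplier-free accumulator produces the same sum as the MAC-based computation, and each weight needs only one bit of storage instead of a full-precision word.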
A systolic-array-based neural network architecture has been implemented on an FPGA. The architecture has been modified
according to the BinaryConnect algorithm, which quantizes the weights to two levels. All implementations have been
verified with the MNIST dataset. The classification accuracy of the hardware implementations has been found comparable with its
The designed hardware accelerator achieves a 12.6-fold reduction in resource utilization compared to the basic
hardware implementation of the neural network with high-precision weights and inputs and conventional MAC units. Power
consumption is reduced by half, and the critical path delay decreases by a factor of 2.4. Thus, larger neural networks
that run at high frequencies with less power can be implemented on an FPGA.
Keywords - Hardware Accelerator, Systolic Array, Deep Neural Networks