Optimizing Fpga Architecture For Deep Learning Workloads