Ensemble Learning on Deep Neural Networks for Image Caption Generation