Submodular data selection in ASR language modeling