Building the Next Generation of Multimodal Models