Towards Temporal Semantic Scene Understanding