Watch, Predict, Act: Robot Learning meets Web Videos