Reinforcement learning of a feedforward controller with soft actor-critic for a reaching task