2016-12-24 A New Softmax Operator for Reinforcement Learning read later https://arxiv.org/pdf/1612.05628.pdf