Fusion of Supervised Learning and Reinforcement Learning for Dynamic Treatment Recommendation