Information theoretic learning methods for Markov decision processes with parametric uncertainty