
From the Department of Mathematics Wiki

Bibliography for MDPs:

  1. Bertsekas, D. P., Dynamic Programming and Optimal Control, vol. I and II, Athena Scientific, 1995. (Later editions: vol. I, 2017 and vol. II, 2012)
  2. Bäuerle, N., Rieder, U. (2011). Markov decision processes with applications to finance. Springer Science & Business Media.
  3. Boucherie, R. J., & van Dijk, N. M. (Eds.) (2017). Markov Decision Processes in Practice. (International Series in Operations Research & Management Science; Vol. 248). Springer. https://doi.org/10.1007/978-3-319-47766-4
  4. Chakravorty, J., & Mahajan, A. (2014). Multi-Armed Bandits, Gittins Index, and its Calculation. Methods and applications of statistics in clinical trials: Planning, analysis, and inferential methods, 2, 416-435.
  5. Feinberg, E. A., & Shwartz, A. (Eds.). (2012). Handbook of Markov decision processes: methods and applications (Vol. 40). Springer Science & Business Media.
  6. Koole, G. (2007). Monotonicity in Markov reward and decision chains: Theory and applications. Foundations and Trends® in Stochastic Systems, 1(1), 1-76.
  7. Puterman, M. L. (2014). Markov decision processes: discrete stochastic dynamic programming. John Wiley & Sons.
  8. Ross, S. M. (2013). Applied probability models with optimization applications. Courier Corporation.
  9. A concise introduction to MDPs can be found in Chapter 17 of M. Mohri, A. Rostamizadeh, and A. Talwalkar. Foundations of Machine Learning, MIT Press, 2018.
  10. Sigaud, O., & Buffet, O. (Eds.). (2013). Markov decision processes in artificial intelligence. John Wiley & Sons.

Bibliography for RL:

  1. Agarwal, A., Jiang, N., Kakade, S. M., Sun, W. Reinforcement Learning: Theory and Algorithms, Working Book.
  2. Bertsekas, D. P., Tsitsiklis, J. N. (1996). Neuro-dynamic programming. Athena Scientific.
  3. Bertsekas, D.P. (2019). Reinforcement learning and optimal control. Athena Scientific.
  4. Meyn, S.P. (2022). Control Systems and Reinforcement Learning, Cambridge University Press.
  5. Powell, W. B. (2007). Approximate Dynamic Programming: Solving the curses of dimensionality (Vol. 703). John Wiley & Sons.
  6. Sutton, R.S., Barto, A.G. (2018). Reinforcement Learning: An Introduction, MIT Press.

Related scientific journals:

  1. Operations Research (INFORMS)
  2. Mathematics of Operations Research (INFORMS)
  3. European Journal of Operational Research (Elsevier)