Rezumat articol ediţie STUDIA UNIVERSITATIS BABEŞ-BOLYAI

În partea de jos este prezentat rezumatul articolului selectat. Pentru revenire la cuprinsul ediţiei din care face parte acest articol, se accesează linkul din titlu. Pentru vizualizarea tuturor articolelor din arhivă la care este autor/coautor unul din autorii de mai jos, se accesează linkul din numele autorului.

 
       
         
    STUDIA INFORMATICA - Ediţia nr.3 din 2011  
         
  Articol:   GEODESIC DISTANCE-BASED KERNEL CONSTRUCTION FOR GAUSSIAN PROCESS VALUE FUNCTION APPROXIMATION.

Autori:  HUNOR JAKAB.
 
       
         
  Rezumat:  

Finding accurate approximations to state and action value functions is essential in Reinforcement learning tasks on continuous Markov Decision Processes. Using Gaussian processes as function approximators we can simultaneously represent model confidence and generalize to unvisited states. To improve the accuracy of the value function approximation in this article I present a new method of constructing geodesic distance based kernel functions from the Markov Decision process induced graph structure. Using sparse on-line Gaussian process regression the nodes and edges of the graph structure are allocated during on-line learning parallel with the inclusion of new measurements to the basis vector set. This results in a more compact and efficient graph structure and more accurate value function estimates. The approximation accuracy is tested on a simulated robotic control task.

Key words and phrases. Reinforcement learning, Gaussian processes.

 
         
     
         
         
      Revenire la pagina precedentă