Abstract: Reinforcement learning methods can achieve excellent results in robot locomotion control. However, they have two serious disadvantages: long agent training times and a large number of parameters defining the agent's behavior. In this paper, we propose a method that significantly reduces training time. It is based on the Policy Modulating Trajectory Generator (PMTG) architecture and uses a Central Pattern Generator (CPG) as the gait generator. We tested this approach on the OpenAI Gym BipedalWalker-v3 environment. The paper presents the results of the algorithm and shows its effectiveness in solving a locomotion problem over challenging terrain.