Abstract
This brief presents a reinforcement learning-based control approach for nonlinear systems. The approach introduces an adjustable policy learning rate (APLR) that reduces the influence of negative or large advantages, which improves the learning stability of the proximal policy optimization (PPO) algorithm. The brief also proposes a Lyapunov-fuzzy reward system to further improve learning efficiency. The Lyapunov reward system incorporates the Lyapunov stability concept into its design, and a dedicated fuzzy reward system is built using knowledge of the cart-pole inverted pendulum and a fuzzy inference system (FIS). Simulation examples validate the merits of the proposed approach.
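To make the APLR idea concrete, the following minimal sketch pairs the standard PPO clipped surrogate with a learning rate that shrinks when an advantage is negative or large in magnitude. The abstract does not state the brief's actual adjustment rule, so `adjusted_learning_rate`, `scale`, and `negative_factor` are hypothetical names and values chosen purely for illustration; only `clipped_surrogate` follows the well-known PPO formulation.

```python
def clipped_surrogate(ratio, advantage, eps=0.2):
    """Standard PPO clipped objective for a single sample.

    ratio     : pi_new(a|s) / pi_old(a|s)
    advantage : estimated advantage of the sampled action
    eps       : clipping range (0.2 is a common default)
    """
    clipped_ratio = max(1.0 - eps, min(ratio, 1.0 + eps))
    return min(ratio * advantage, clipped_ratio * advantage)


def adjusted_learning_rate(base_lr, advantage, scale=1.0, negative_factor=0.5):
    """Hypothetical adjustment rule (an assumption, not the rule from the brief).

    The step size decays as |advantage| grows, and is reduced further
    when the advantage is negative.
    """
    lr = base_lr / (1.0 + abs(advantage) / scale)
    if advantage < 0:
        lr *= negative_factor
    return lr


if __name__ == "__main__":
    for adv in (0.1, 1.0, 10.0, -1.0):
        lr = adjusted_learning_rate(1e-3, adv)
        obj = clipped_surrogate(1.5, adv)
        print(f"advantage={adv:6.2f}  lr={lr:.6f}  clipped objective={obj:.4f}")
```

Under these assumptions, samples with small positive advantages keep a learning rate close to `base_lr`, while large or negative advantages produce smaller policy updates, which is the qualitative behavior the abstract attributes to APLR.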
Original language | English |
---|---|
Article number | 8871158 |
Pages (from-to) | 2059-2063 |
Number of pages | 5 |
Journal | IEEE Transactions on Circuits and Systems II: Express Briefs |
Volume | 67 |
Issue number | 10 |
Publication status | Published - Oct 2020 |
Keywords
- adjustable policy learning rate (APLR)
- cart-pole inverted pendulum
- fuzzy reward system
- Lyapunov reward system
- proximal policy optimization (PPO)