TY - CHAP
T1 - Model-free based automated trajectory optimization for UAVs toward data transmission
AU - Cui, Jingjing
AU - Ding, Zhiguo
AU - Deng, Yansha
AU - Nallanathan, Arumugam
PY - 2019/12
Y1 - 2019/12
AB - In this paper, we consider an unmanned aerial vehicle (UAV) enabled wireless network with a set of ground devices that are randomly distributed in an area, each having a certain amount of data to transmit. The UAV flies over this region from a starting point to a destination. During its flight, the UAV communicates with the ground devices, and its trajectory is optimized to maximize the cumulative collected data subject to a flight time constraint. Due to uncertainty in the locations of the ground devices and in the communication dynamics, an accurate system model is difficult to acquire and maintain. With the help of stochastic modelling, we present a reinforcement learning based automated trajectory optimization algorithm. By dividing the considered region into small grids, which yields finite state and action spaces, we apply a Q-learning based automated trajectory optimization approach to maximize the cumulative data collected during the flight time. Simulation results demonstrate that the reinforcement learning approach can find an optimal strategy under the flight time constraint.
UR - http://www.scopus.com/inward/record.url?scp=85081967565&partnerID=8YFLogxK
U2 - 10.1109/GLOBECOM38437.2019.9013644
DO - 10.1109/GLOBECOM38437.2019.9013644
M3 - Conference paper
T3 - 2019 IEEE Global Communications Conference (GLOBECOM)
BT - 2019 IEEE Global Communications Conference, GLOBECOM 2019 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2019 IEEE Global Communications Conference, GLOBECOM 2019
Y2 - 9 December 2019 through 13 December 2019
ER -