gp_help.txt

benchmarks for best implementation and comparison of optimal performance RL algo:
https://spinningup.openai.com/en/latest/spinningup/bench.html#walker2d-pytorch-versions
The Spinning Up implementation of DDPG does not support parallelization.

Extensive experiments are carried out on the dataset of
Alibaba cluster-trace-v2018. It is an open-source, real
production clusters workload traces for an 8-day-long period
and 4000 machines. The dataset contains characteristics of the
servers in terms of computational resources (CPU, memory,
communication bandwidth, etc.) and characteristics of online
tasks and other tasks
DB Ref:  https://github.com/alibaba/clusterdata/tree/master/cluster-trace-v2018/


Ask prof. help get paper:
Juan Chen, Huanlai Xing, Zhiwen Xiao, Lexi Xu, and Tao Tao. A
drl agent for jointly optimizing computation offloading and resource
allocation in mec. IEEE Internet of Things Journal, pages 1–1, 2021 (DDPG algo.)
Yang Kunpeng, Hangguan Shan, Tengxu Sun, Roland Hu, Yingxiao Wu,
Lu Yu, Zhaoyang Zhang, and Tony QS Quek. Reinforcement learningbased mobile edge computing and transmission scheduling for video
surveillance. IEEE Transactions on Emerging Topics in Computing,
pages 1–1, 2021. (DQN algo.)
 Xiantao Jiang, F Richard Yu, Tian Song, and Victor CM Leung.
Intelligent resource allocation for video analytics in blockchain-enabled
internet of autonomous vehicles with edge computing. IEEE Internet of
Things Journal, pages 1–1, 2020 (A3C algo.)

In the paper On-policy vs off-policy, it assumed the propagation delay to be distance/speed. Is it correct?
It placed the training model and inference model both in the non-RT-RIC instead of placing training model in near-RT-RIC

Next: comparitive analysis of td3 plot, Generative AI for policy gen, system modelling, Meta Learning, Hybrid Learning,