Importance Prioritized Policy Distillation

Xinghua Qu; Yew Soon Ong; Abhishek Gupta; Pengfei Wei; Zhu Sun; Zejun Ma; ACM

doi:10.1145/3534678.3539266

Conference proceeding

Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp.1420-1429

ACM Conferences

KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

14/08/2022

DOI: https://doi.org/10.1145/3534678.3539266

Abstract

Computing methodologies -- Machine learning

Policy distillation (PD) has been widely studied in deep reinforcement learning (RL), but existing PD approaches assume that the demonstration data (i.e., state-action pairs in frames) in a decision-making sequence are uniformly distributed. This assumption may introduce unwanted bias, since RL is a reward-maximizing process rather than simple label matching. To address this issue, we define the importance of a frame as its contribution to the expected reward, and hypothesize that adapting to frame importance could improve the performance of the distilled student policy. To verify this hypothesis, we analyze why and how frame importance matters in RL settings. Based on the analysis, we propose an importance-prioritized PD framework that emphasizes training on important frames so that the student learns efficiently. In particular, frame importance is measured by the reciprocal of the weighted Shannon entropy of a teacher policy's action prescriptions. Experiments on Atari games and policy compression tasks show that capturing frame importance significantly boosts the performance of the distilled policies.
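The abstract's importance measure can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the Shannon entropy is weighted or how importance enters the training loss. The sketch therefore assumes a hypothetical per-action weight vector (uniform by default, which reduces to plain Shannon entropy), a small constant eps to avoid division by zero, normalization of importance over a batch, and an importance-weighted KL divergence as the distillation loss. All function and parameter names are illustrative.

```python
import math


def weighted_entropy(probs, weights=None):
    """Weighted Shannon entropy: -sum_i w_i * p_i * log(p_i).

    `weights` is a hypothetical per-action weight vector (an assumption);
    with uniform weights of 1.0 this is the ordinary Shannon entropy.
    """
    if weights is None:
        weights = [1.0] * len(probs)
    return -sum(w * p * math.log(p) for p, w in zip(probs, weights) if p > 0.0)


def frame_importance(probs, weights=None, eps=1e-8):
    """Reciprocal of the weighted entropy of the teacher's action distribution.

    `eps` (an assumption) keeps the value finite for near-deterministic frames.
    """
    return 1.0 / (weighted_entropy(probs, weights) + eps)


def normalized_importance(teacher_frames, weights=None):
    """Importance of each frame in a batch, rescaled to sum to 1."""
    raw = [frame_importance(p, weights) for p in teacher_frames]
    total = sum(raw)
    return [r / total for r in raw]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same actions."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0.0)


def prioritized_distillation_loss(teacher_frames, student_frames, weights=None):
    """Importance-weighted sum of per-frame KL(teacher || student)."""
    importance = normalized_importance(teacher_frames, weights)
    return sum(
        w * kl_divergence(t, s)
        for w, t, s in zip(importance, teacher_frames, student_frames)
    )


if __name__ == "__main__":
    # Frame 0: the teacher is confident; frame 1: the teacher is indifferent.
    teacher = [[0.9, 0.05, 0.05], [1 / 3, 1 / 3, 1 / 3]]
    student = [[0.6, 0.2, 0.2], [0.5, 0.25, 0.25]]
    print(normalized_importance(teacher))
    print(prioritized_distillation_loss(teacher, student))
```

Under these assumptions, a frame on which the teacher strongly prefers one action has low entropy and therefore receives a larger share of the loss than a frame on which the teacher is nearly indifferent among actions.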

Details

Title: Importance Prioritized Policy Distillation
Creators - without role: Xinghua Qu - Bytedance AI Lab, Singapore, Singapore
Yew Soon Ong - Nanyang Technological University
Abhishek Gupta - Singapore Institute of Manufacturing Technology
Pengfei Wei - Bytedance AI Lab, Singapore, Singapore
Zhu Sun - Agency for Science, Technology and Research
Zejun Ma - Bytedance AI Lab, Beijing, China
ACM
Publication Details: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp.1420-1429
Conference: KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
Series: ACM Conferences
Publisher: ACM
Number of pages: 10
Identifiers: 9912423109846
Academic Unit: ISTD Pillar
Language: English
Resource Type: Conference proceeding
