LRP-based Policy Pruning and Distillation of Reinforcement Learning Agents for Embedded Systems
Nanjing University of Science and Technology, School of Computer Science and Engineering, Nanjing, China.
Umeå universitet, Teknisk-naturvetenskapliga fakulteten, Institutionen för tillämpad fysik och elektronik.
Umeå universitet, Teknisk-naturvetenskapliga fakulteten, Institutionen för tillämpad fysik och elektronik. ORCID iD: 0000-0003-4228-2774
Nanjing University of Science and Technology, School of Computer Science and Engineering, Nanjing, China.
2022 (English) In: 2022 IEEE 25th International Symposium on Real-Time Distributed Computing, ISORC 2022, Institute of Electrical and Electronics Engineers (IEEE), 2022. Conference paper, published paper (refereed).
Abstract [en]

Reinforcement Learning (RL) is an effective approach to developing control policies by maximizing the agent's reward. Deep Reinforcement Learning (DRL) uses Deep Neural Networks (DNNs) for function approximation in RL and has achieved tremendous success in recent years. Large DNNs often incur significant memory and computational overheads, which greatly impedes their deployment in resource-constrained embedded systems. To deploy a trained RL agent on an embedded system, it is necessary to compress the agent's Policy Network to improve its memory and computation efficiency. In this paper, we perform model compression of the Policy Network of an RL agent by leveraging the relevance scores computed by Layer-wise Relevance Propagation (LRP), a technique for Explainable AI (XAI), to rank and prune the convolutional filters in the Policy Network, combined with fine-tuning via Policy Distillation. Performance evaluation on several Atari games indicates that the proposed approach is effective in reducing the model size and inference time of RL agents.
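The abstract describes a two-step pipeline: rank and prune the Policy Network's convolutional filters using LRP relevance scores, then recover performance by fine-tuning the pruned network against the original agent with Policy Distillation. The sketch below illustrates only those two steps and is not the authors' implementation: the network architecture, the keep ratio, and the temperature are illustrative, and the per-filter relevance scores are assumed to come from an LRP backward pass that is not shown here.

```python
# Hypothetical sketch: filter pruning guided by (assumed) per-filter LRP relevance
# scores, followed by policy-distillation fine-tuning. Names and shapes are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PolicyNet(nn.Module):
    """Small Atari-style policy network (illustrative architecture)."""

    def __init__(self, n_actions: int, in_channels: int = 4):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, 32, kernel_size=8, stride=4)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=4, stride=2)
        self.conv3 = nn.Conv2d(64, 64, kernel_size=3, stride=1)
        self.fc = nn.Linear(64 * 7 * 7, 512)
        self.head = nn.Linear(512, n_actions)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.relu(self.conv2(x))
        x = F.relu(self.conv3(x))
        x = F.relu(self.fc(x.flatten(1)))
        return self.head(x)  # action logits


def prune_filters_by_relevance(conv: nn.Conv2d, relevance: torch.Tensor, keep_ratio: float):
    """Zero out the output filters with the lowest relevance scores.

    `relevance` holds one score per output filter (assumed to be produced by LRP);
    masking the weights is a simple stand-in for structurally removing filters.
    """
    n_keep = max(1, int(keep_ratio * conv.out_channels))
    keep_idx = torch.topk(relevance, n_keep).indices
    mask = torch.zeros(conv.out_channels, dtype=torch.bool)
    mask[keep_idx] = True
    with torch.no_grad():
        conv.weight[~mask] = 0.0
        if conv.bias is not None:
            conv.bias[~mask] = 0.0
    return mask


def distillation_loss(student_logits, teacher_logits, tau: float = 0.01):
    """Policy-distillation loss: KL divergence between a temperature-sharpened
    teacher policy and the student policy."""
    teacher_probs = F.softmax(teacher_logits / tau, dim=-1)
    student_log_probs = F.log_softmax(student_logits, dim=-1)
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")


if __name__ == "__main__":
    teacher = PolicyNet(n_actions=6)
    student = PolicyNet(n_actions=6)
    student.load_state_dict(teacher.state_dict())

    # Placeholder relevance scores; in the paper these come from LRP.
    for conv in (student.conv1, student.conv2, student.conv3):
        rel = torch.rand(conv.out_channels)
        prune_filters_by_relevance(conv, rel, keep_ratio=0.5)

    # One distillation fine-tuning step on a batch of (fake) observations.
    opt = torch.optim.Adam(student.parameters(), lr=1e-4)
    obs = torch.randn(8, 4, 84, 84)
    with torch.no_grad():
        t_logits = teacher(obs)
    loss = distillation_loss(student(obs), t_logits)
    opt.zero_grad()
    loss.backward()
    opt.step()
    print(f"distillation loss: {loss.item():.4f}")
```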

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2022.
Keywords [en]
embedded systems, Knowledge Distillation, Policy Distillation, Reinforcement Learning
National subject category
Computer Sciences
Identifiers
URN: urn:nbn:se:umu:diva-198614
DOI: 10.1109/ISORC52572.2022.9812837
ISI: 000863009700006
Scopus ID: 2-s2.0-85135377522
ISBN: 9781665406277 (digital)
OAI: oai:DiVA.org:umu-198614
DiVA, id: diva2:1693944
Conference
25th IEEE International Symposium on Real-Time Distributed Computing, ISORC 2022, Västerås, Sweden, 17-18 May, 2022.
Available from: 2022-09-08 Created: 2022-09-08 Last updated: 2023-09-05 Bibliographically approved

Open Access in DiVA

Full text is not available in DiVA

Other links

Publisher's full text | Scopus

Person

Luan, Siyu; Gu, Zonghua
