LRP-based network pruning and policy distillation of robust and non-robust DRL agents for embedded systems
Umeå University, Faculty of Science and Technology, Department of Applied Physics and Electronics.
Umeå University, Faculty of Science and Technology, Department of Applied Physics and Electronics. ORCID iD: 0000-0003-4228-2774
School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China.
School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China.
2023 (English). In: Concurrency and Computation, ISSN 1532-0626, E-ISSN 1532-0634, Vol. 35, no 19, article id e7351. Article in journal (Refereed). Published.
Abstract [en]

Reinforcement learning (RL) is an effective approach to developing control policies by maximizing the agent's reward. Deep reinforcement learning uses deep neural networks (DNNs) for function approximation in RL, and has achieved tremendous success in recent years. Large DNNs often incur significant memory and computational overheads, which may impede their deployment in resource-constrained embedded systems. To deploy a trained RL agent on an embedded system, it is necessary to compress its policy network to improve memory and computation efficiency. In this article, we perform model compression of the policy network of an RL agent by leveraging the relevance scores computed by layer-wise relevance propagation (LRP), a technique for Explainable AI (XAI), to rank and prune the convolutional filters in the policy network, combined with fine-tuning with policy distillation. Performance evaluation on several Atari games indicates that our proposed approach is effective in reducing the model size and inference time of RL agents. We also compare robust RL agents trained with RADIAL-RL against standard RL agents, and show that a robust RL agent can achieve better performance (higher average reward) after pruning than a standard RL agent across different attack strengths and pruning rates.
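For intuition, here is a minimal sketch of the two steps the abstract describes: structured pruning of convolutional filters ranked by LRP relevance, followed by fine-tuning with a policy-distillation loss. It assumes a PyTorch policy network; the per-filter relevance vector is taken as given (in the article it comes from an LRP backward pass), and all function names and hyperparameters here are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn.functional as F

def prune_filters_by_relevance(conv, relevance, prune_rate):
    # relevance: one LRP score per output filter of this conv layer
    # (assumed to be computed elsewhere by an LRP backward pass)
    n_drop = int(prune_rate * conv.out_channels)
    drop = torch.argsort(relevance)[:n_drop]   # least-relevant filters
    mask = torch.ones(conv.out_channels)
    mask[drop] = 0.0
    # structured pruning via masking: zero out whole filters (and biases)
    conv.weight.data *= mask.view(-1, 1, 1, 1)
    if conv.bias is not None:
        conv.bias.data *= mask
    return mask

def distill_step(teacher, student, obs, optimizer, tau=0.01):
    # policy distillation: push the pruned student's action distribution
    # toward the unpruned teacher's temperature-softened distribution
    with torch.no_grad():
        teacher_logits = teacher(obs)
    student_logits = student(obs)
    loss = F.kl_div(F.log_softmax(student_logits, dim=-1),
                    F.softmax(teacher_logits / tau, dim=-1),
                    reduction="batchmean")
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

The low temperature tau follows the common policy-distillation heuristic of sharpening the teacher's distribution; the exact loss and fine-tuning schedule used in the article may differ.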

Place, publisher, year, edition, pages
John Wiley & Sons, 2023. Vol. 35, no 19, article id e7351
Keywords [en]
embedded systems, knowledge distillation, policy distillation, reinforcement learning
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:umu:diva-200565
DOI: 10.1002/cpe.7351
ISI: 000868806400001
Scopus ID: 2-s2.0-85139981238
OAI: oai:DiVA.org:umu-200565
DiVA, id: diva2:1715243
Note

Special Issue. 

First published online October 2022.

Available from: 2022-12-01. Created: 2022-12-01. Last updated: 2023-11-09. Bibliographically approved.
In thesis
1. Towards safe and efficient application of deep neural networks in resource-constrained real-time embedded systems
2023 (English). Doctoral thesis, comprehensive summary (Other academic).
Abstract [en]

We consider real-time safety-critical systems that feature closed-loop interactions between the embedded computing system and the physical environment with a sense-compute-actuate feedback loop. Deep Learning (DL) with Deep Neural Networks (DNNs) has achieved success in many application domains, but there are still significant challenges in its application in real-time safety-critical systems that require high levels of safety certification under significant hardware resource constraints. This thesis considers the following overarching research goal: How to achieve safe and efficient application of DNNs in resource-constrained Real-Time Embedded (RTE) systems in the context of safety-critical application domains such as Autonomous Driving? Towards reaching that goal, this thesis presents a set of algorithms and techniques that aim to address three Research Questions (RQs): RQ1: How to achieve accurate and efficient Out-of-Distribution (OOD) detection for DNNs in RTE systems? RQ2: How to predict the performance of DNNs under continuous distribution shifts? RQ3: How to achieve efficient inference of Deep Reinforcement Learning (DRL) agents in RTE systems?

For RQ1, we present a framework for OOD detection based on outlier detection in one or more hidden layers of a DNN with either Isolation Forest (IF) or Local Outlier Factor (LOF). We also perform a comprehensive and systematic benchmark study of multiple well-known OOD detection algorithms in terms of both accuracy and execution time on different hardware platforms, in order to provide a useful reference for the practical deployment of OOD detection algorithms in RTE systems. For RQ2, we present a framework for predicting the performance of DNNs for end-to-end Autonomous Driving under continuous distribution shifts with two approaches: using an Autoencoder that attempts to reconstruct the input image; and applying Anomaly Detection algorithms to the hidden layer(s) of the DNN. For RQ3, we present a framework for model compression of the policy network of a DRL agent for deployment in RTE systems by leveraging the relevance scores computed by Layer-wise Relevance Propagation (LRP) to rank and prune the convolutional filters, combined with fine-tuning using policy distillation.
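A minimal sketch of the RQ1 idea: flag inputs whose hidden-layer activations look anomalous, here using scikit-learn's IsolationForest. The activation extraction, shapes, and parameters are illustrative stand-ins, not the thesis code; the thesis also evaluates Local Outlier Factor and other detectors in the same role.

import numpy as np
from sklearn.ensemble import IsolationForest

def fit_ood_detector(hidden_train):
    # hidden_train: (n_samples, n_features) activations from one hidden
    # layer of the DNN, collected on in-distribution training data
    det = IsolationForest(n_estimators=100, random_state=0)
    det.fit(hidden_train)
    return det

def is_ood(det, hidden_test):
    # IsolationForest.predict returns +1 for inliers, -1 for outliers
    return det.predict(hidden_test) == -1

# Toy usage with random stand-in activations: a mean-shifted batch
# should be flagged as out-of-distribution.
rng = np.random.default_rng(0)
det = fit_ood_detector(rng.normal(size=(1000, 64)))
print(is_ood(det, rng.normal(loc=5.0, size=(16, 64))))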

The algorithms and techniques developed in this thesis have been evaluated on standard datasets and benchmarks. To summarize our findings: we have developed novel OOD detection algorithms with high accuracy and efficiency; identified, through benchmarking, OOD detection algorithms with relatively high accuracy and low execution times; developed a framework for DNN performance prediction under continuous distribution shifts and identified the most effective Anomaly Detection algorithms for use in it; and developed a framework for model compression of DRL agents that is effective in reducing model size and inference time for deployment in RTE systems. These results are expected to assist system designers in the safe and efficient application of DNNs in resource-constrained RTE systems.
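The RQ2 approach summarized above can likewise be sketched as a reconstruction-error test: an autoencoder trained on in-distribution images reconstructs shifted inputs poorly, so its per-sample error serves as a drift (and hence performance-degradation) indicator. A toy PyTorch version, with the architecture and dimensions chosen only for illustration:

import torch
import torch.nn as nn

class TinyAutoencoder(nn.Module):
    # illustrative stand-in for the autoencoder used in the framework
    def __init__(self, dim=784, latent=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim, 128), nn.ReLU(),
                                 nn.Linear(128, latent))
        self.dec = nn.Sequential(nn.Linear(latent, 128), nn.ReLU(),
                                 nn.Linear(128, dim))

    def forward(self, x):
        return self.dec(self.enc(x))

def drift_score(ae, x):
    # per-sample mean squared reconstruction error; higher values
    # suggest the input has drifted from the training distribution
    with torch.no_grad():
        return ((ae(x) - x) ** 2).mean(dim=1)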

Place, publisher, year, edition, pages
Umeå: Umeå University, 2023. p. 49
Keywords
Machine Learning/Deep Learning, Real-Time Embedded Systems, Out-of-Distribution Detection, Distribution Shifts, Deep Reinforcement Learning, Model Compression, Policy Distillation
National Category
Computer Sciences
Identifiers
urn:nbn:se:umu:diva-214365 (URN)
978-91-8070-161-7 (ISBN)
978-91-8070-160-0 (ISBN)
Public defence
2023-10-09, Triple Helix, Samverkanshuset, Umeå, 13:00 (English)
Available from: 2023-09-18. Created: 2023-09-12. Last updated: 2023-09-13. Bibliographically approved.

Open Access in DiVA

fulltext (1608 kB), 37 downloads
File information
File name: FULLTEXT03.pdf
File size: 1608 kB
Checksum (SHA-512): 5571ca3332838d0544fff4d552d795511de1d073d256210c7a9bb38ebd1ac18a193a1afc2486efe2e2494505ba7be9c2ab5d2ff6bfd4b3cfb884903e24bc9b17
Type: fulltext
Mimetype: application/pdf

Other links

Publisher's full text
Scopus

Authority records

Luan, Siyu; Gu, Zonghua

Total: 166 downloads
The number of downloads is the sum of all downloads of full texts. It may include, for example, earlier versions that are no longer available.
