LRP-based network pruning and policy distillation of robust and non-robust DRL agents for embedded systems
Umeå University, Faculty of Science and Technology, Department of Applied Physics and Electronics.
Umeå University, Faculty of Science and Technology, Department of Applied Physics and Electronics. ORCID iD: 0000-0003-4228-2774
School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China.
School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China.
2023 (English) In: Concurrency and Computation, ISSN 1532-0626, E-ISSN 1532-0634, Vol. 35, no. 19, article id e7351. Article in journal (Refereed) Published
Abstract [en]

Reinforcement learning (RL) is an effective approach to developing control policies by maximizing the agent's reward. Deep reinforcement learning uses deep neural networks (DNNs) for function approximation in RL, and has achieved tremendous success in recent years. Large DNNs often incur significant memory size and computational overheads, which may impede their deployment into resource-constrained embedded systems. For deployment of a trained RL agent on embedded systems, it is necessary to compress the policy network of the RL agent to improve its memory and computation efficiency. In this article, we perform model compression of the policy network of an RL agent by leveraging the relevance scores computed by layer-wise relevance propagation (LRP), a technique for Explainable AI (XAI), to rank and prune the convolutional filters in the policy network, combined with fine-tuning with policy distillation. Performance evaluation based on several Atari games indicates that our proposed approach is effective in reducing model size and inference time of RL agents. We also consider robust RL agents trained with RADIAL-RL versus standard RL agents, and show that a robust RL agent can achieve better performance (higher average reward) after pruning than a standard RL agent for different attack strengths and pruning rates.
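The compression pipeline described in the abstract (rank convolutional filters by LRP relevance, prune the lowest-ranked ones, then fine-tune with policy distillation) can be sketched roughly as follows. This is an illustrative sketch, not the paper's implementation: the per-filter relevance scores are assumed to have been precomputed by LRP, and all function names are hypothetical.

```python
import numpy as np

def filters_to_keep(relevance, prune_rate):
    """Rank convolutional filters by their (precomputed) LRP relevance
    scores and return the indices of the filters that survive pruning."""
    n_keep = max(1, int(round(len(relevance) * (1.0 - prune_rate))))
    order = np.argsort(relevance)[::-1]        # most relevant first
    return np.sort(order[:n_keep])

def softmax(logits, tau):
    z = np.asarray(logits, dtype=float) / tau
    z -= z.max()                               # numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, tau=0.01):
    """KL(teacher || student) over temperature-softened action
    distributions: the fine-tuning objective in policy distillation."""
    p = softmax(teacher_logits, tau)
    q = softmax(student_logits, tau)
    return float(np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12))))

# Toy example: keep the 3 most relevant of 6 filters at a 50% prune rate.
relevance = np.array([0.9, 0.1, 0.5, 0.05, 0.7, 0.3])
print(filters_to_keep(relevance, prune_rate=0.5))  # [0 2 4]
```

After pruning, the smaller student network would be fine-tuned by minimizing `distillation_loss` against the original (teacher) policy's action logits, so the compressed agent imitates the teacher's behavior rather than being retrained from scratch.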

Place, publisher, year, edition, pages
John Wiley & Sons, 2023. Vol. 35, no. 19, article id e7351
Keywords [en]
embedded systems, knowledge distillation, policy distillation, reinforcement learning
Identifiers
URN: urn:nbn:se:umu:diva-200565
DOI: 10.1002/cpe.7351
ISI: 000868806400001
Scopus ID: 2-s2.0-85139981238
OAI: oai:DiVA.org:umu-200565
DiVA id: diva2:1715243
Note

Special Issue. 

First published online October 2022.

Available from: 2022-12-01 Created: 2022-12-01 Last updated: 2023-11-09 Bibliographically approved
Part of thesis
1. Towards safe and efficient application of deep neural networks in resource-constrained real-time embedded systems
2023 (English) Doctoral thesis, comprising papers (Other academic)
Abstract [en]

We consider real-time safety-critical systems that feature closed-loop interactions between the embedded computing system and the physical environment with a sense-compute-actuate feedback loop. Deep Learning (DL) with Deep Neural Networks (DNNs) has achieved success in many application domains, but there are still significant challenges in its application in real-time safety-critical systems that require high levels of safety certification under significant hardware resource constraints. This thesis considers the following overarching research goal: How to achieve safe and efficient application of DNNs in resource-constrained Real-Time Embedded (RTE) systems in the context of safety-critical application domains such as Autonomous Driving? Towards reaching that goal, this thesis presents a set of algorithms and techniques that aim to address three Research Questions (RQs): RQ1: How to achieve accurate and efficient Out-of-Distribution (OOD) detection for DNNs in RTE systems? RQ2: How to predict the performance of DNNs under continuous distribution shifts? RQ3: How to achieve efficient inference of Deep Reinforcement Learning (DRL) agents in RTE systems?

For RQ1, we present a framework for OOD detection based on outlier detection in one or more hidden layers of a DNN with either Isolation Forest (IF) or Local Outlier Factor (LOF). We also perform a comprehensive and systematic benchmark study of multiple well-known OOD detection algorithms in terms of both accuracy and execution time on different hardware platforms, in order to provide a useful reference for the practical deployment of OOD detection algorithms in RTE systems. For RQ2, we present a framework for predicting the performance of DNNs for end-to-end Autonomous Driving under continuous distribution shifts with two approaches: using an Autoencoder that attempts to reconstruct the input image; and applying Anomaly Detection algorithms to the hidden layer(s) of the DNN. For RQ3, we present a framework for model compression of the policy network of a DRL agent for deployment in RTE systems by leveraging the relevance scores computed by Layer-wise Relevance Propagation (LRP) to rank and prune the convolutional filters, combined with fine-tuning using policy distillation.
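The RQ1 framework detects OOD inputs by applying an outlier detector (Isolation Forest or Local Outlier Factor) to hidden-layer activations of the DNN. A minimal stand-in for that idea, substituting a simple k-th-nearest-neighbor distance for IF/LOF as the per-layer outlier score, might look like the sketch below; the synthetic "activations" and all names are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for hidden-layer activations: in-distribution
# features are drawn near the origin, mimicking the feature cloud a
# trained DNN produces for inputs it was trained on.
train_feats = rng.normal(0.0, 1.0, size=(200, 8))

def knn_outlier_score(x, reference, k=5):
    """Distance from x to its k-th nearest reference activation.
    A simplified distance-based substitute for Isolation Forest / LOF
    as the per-layer outlier score in the OOD-detection framework."""
    d = np.linalg.norm(reference - x, axis=1)
    return float(np.sort(d)[k - 1])

in_dist = rng.normal(0.0, 1.0, size=8)   # resembles the training data
ood = np.full(8, 10.0)                   # far outside the feature cloud

# An OOD input should score much higher than an in-distribution one.
print(knn_outlier_score(in_dist, train_feats) < knn_outlier_score(ood, train_feats))  # True
```

In a deployment, the score would be thresholded (the threshold calibrated on held-out in-distribution data) to decide whether the DNN's prediction can be trusted for the current input.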

The algorithms and techniques developed in this thesis have been evaluated on standard datasets and benchmarks. To summarize our findings: we have developed novel OOD detection algorithms with high accuracy and efficiency; identified OOD detection algorithms with relatively high accuracy and low execution times through benchmarking; developed a framework for DNN performance prediction under continuous distribution shifts, and identified the most effective Anomaly Detection algorithms for use in the framework; and developed a framework for model compression of DRL agents that is effective in reducing model size and inference time for deployment in RTE systems. The research results are expected to assist system designers in the safe and efficient application of DNNs in resource-constrained RTE systems.

Place, publisher, year, edition, pages
Umeå: Umeå University, 2023. p. 49
Keywords
Machine Learning/Deep Learning, Real-Time Embedded systems, Out-of-Distribution Detection, Distribution Shifts, Deep Reinforcement Learning, Model Compression, Policy Distillation.
Identifiers
urn:nbn:se:umu:diva-214365 (URN)
978-91-8070-161-7 (ISBN)
978-91-8070-160-0 (ISBN)
Public defence
2023-10-09, Triple Helix, Samverkanshuset, Umeå, 13:00 (English)
Available from: 2023-09-18 Created: 2023-09-12 Last updated: 2023-09-13 Bibliographically approved

Open Access in DiVA

fulltext (1608 kB)
File information
File: FULLTEXT03.pdf File size: 1608 kB Checksum: SHA-512
5571ca3332838d0544fff4d552d795511de1d073d256210c7a9bb38ebd1ac18a193a1afc2486efe2e2494505ba7be9c2ab5d2ff6bfd4b3cfb884903e24bc9b17
Type: fulltext Mimetype: application/pdf

Other links

Publisher's full text | Scopus

Authors

Luan, Siyu; Gu, Zonghua
