LRP-based network pruning and policy distillation of robust and non-robust DRL agents for embedded systems
Umeå University, Faculty of Science and Technology, Department of Applied Physics and Electronics.
Umeå University, Faculty of Science and Technology, Department of Applied Physics and Electronics. ORCID iD: 0000-0003-4228-2774
School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China.
School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China.
2023 (English). In: Concurrency and Computation, ISSN 1532-0626, E-ISSN 1532-0634, Vol. 35, no. 19, article id e7351. Article in journal (Refereed). Published.
Abstract [en]

Reinforcement learning (RL) is an effective approach to developing control policies by maximizing the agent's reward. Deep reinforcement learning uses deep neural networks (DNNs) for function approximation in RL, and has achieved tremendous success in recent years. Large DNNs often incur significant memory size and computational overheads, which may impede their deployment into resource-constrained embedded systems. For deployment of a trained RL agent on embedded systems, it is necessary to compress the policy network of the RL agent to improve its memory and computation efficiency. In this article, we perform model compression of the policy network of an RL agent by leveraging the relevance scores computed by layer-wise relevance propagation (LRP), a technique for Explainable AI (XAI), to rank and prune the convolutional filters in the policy network, combined with fine-tuning with policy distillation. Performance evaluation based on several Atari games indicates that our proposed approach is effective in reducing model size and inference time of RL agents. We also consider robust RL agents trained with RADIAL-RL versus standard RL agents, and show that a robust RL agent can achieve better performance (higher average reward) after pruning than a standard RL agent for different attack strengths and pruning rates.
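The core idea of the abstract — rank convolutional filters by their total LRP relevance, prune the least relevant ones, and fine-tune the pruned student against the teacher's action distribution — can be illustrated with a minimal sketch. This is not the paper's implementation; the function names and the shape of the relevance tensor are assumptions for illustration.

```python
import numpy as np

def filter_relevance(relevance_maps):
    """Total LRP relevance attributed to each convolutional filter.

    relevance_maps: assumed array of shape (num_filters, H, W), holding
    the relevance propagated back to each filter's output feature map.
    """
    return relevance_maps.reshape(relevance_maps.shape[0], -1).sum(axis=1)

def filters_to_keep(relevance_maps, keep_ratio):
    """Rank filters by total relevance and keep the top fraction."""
    scores = filter_relevance(relevance_maps)
    k = max(1, int(round(keep_ratio * len(scores))))
    # Indices of the k most relevant filters, in ascending order.
    return np.sort(np.argsort(scores)[::-1][:k])

def distillation_loss(teacher_logits, student_logits, tau=1.0):
    """KL(teacher || student) over temperature-softened action
    distributions -- the usual policy-distillation objective used to
    fine-tune the pruned student network."""
    def softened(x):
        z = np.exp((x - x.max(axis=-1, keepdims=True)) / tau)
        return z / z.sum(axis=-1, keepdims=True)
    p, q = softened(teacher_logits), softened(student_logits)
    return float(np.mean(np.sum(p * (np.log(p) - np.log(q)), axis=-1)))
```

For example, with four filters whose relevance maps sum to [0, 4, 0, 8], `filters_to_keep(maps, 0.5)` selects filters 1 and 3; the distillation loss is zero when the student exactly matches the teacher's logits.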

Place, publisher, year, edition, pages
John Wiley & Sons, 2023. Vol. 35, no. 19, article id e7351
Keywords [en]
embedded systems, knowledge distillation, policy distillation, reinforcement learning
National subject category
Computer Sciences
Identifiers
URN: urn:nbn:se:umu:diva-200565
DOI: 10.1002/cpe.7351
ISI: 000868806400001
Scopus ID: 2-s2.0-85139981238
OAI: oai:DiVA.org:umu-200565
DiVA, id: diva2:1715243
Note

Special Issue. 

First published online October 2022.

Available from: 2022-12-01 Created: 2022-12-01 Last updated: 2023-11-09 Bibliographically approved
Part of thesis
1. Towards safe and efficient application of deep neural networks in resource-constrained real-time embedded systems
2023 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

We consider real-time safety-critical systems that feature closed-loop interactions between the embedded computing system and the physical environment with a sense-compute-actuate feedback loop. Deep Learning (DL) with Deep Neural Networks (DNNs) has achieved success in many application domains, but there are still significant challenges in its application in real-time safety-critical systems that require high levels of safety certification under significant hardware resource constraints. This thesis considers the following overarching research goal: How to achieve safe and efficient application of DNNs in resource-constrained Real-Time Embedded (RTE) systems in the context of safety-critical application domains such as Autonomous Driving? Towards reaching that goal, this thesis presents a set of algorithms and techniques that aim to address three Research Questions (RQs): RQ1: How to achieve accurate and efficient Out-of-Distribution (OOD) detection for DNNs in RTE systems? RQ2: How to predict the performance of DNNs under continuous distribution shifts? RQ3: How to achieve efficient inference of Deep Reinforcement Learning (DRL) agents in RTE systems?

For RQ1, we present a framework for OOD detection based on outlier detection in one or more hidden layers of a DNN with either Isolation Forest (IF) or Local Outlier Factor (LOF). We also perform a comprehensive and systematic benchmark study of multiple well-known OOD detection algorithms in terms of both accuracy and execution time on different hardware platforms, in order to provide a useful reference for the practical deployment of OOD detection algorithms in RTE systems. For RQ2, we present a framework for predicting the performance of DNNs for end-to-end Autonomous Driving under continuous distribution shifts with two approaches: using an Autoencoder that attempts to reconstruct the input image; and applying Anomaly Detection algorithms to the hidden layer(s) of the DNN. For RQ3, we present a framework for model compression of the policy network of a DRL agent for deployment in RTE systems by leveraging the relevance scores computed by Layer-wise Relevance Propagation (LRP) to rank and prune the convolutional filters, combined with fine-tuning using policy distillation.
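The RQ1 framework — fitting an outlier detector such as Isolation Forest on hidden-layer activations of in-distribution data, then flagging inputs whose activations score as outliers — can be sketched as follows. This is a toy illustration under assumed data, not the thesis implementation: the synthetic "activations" stand in for real hidden-layer features of a DNN.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Hypothetical hidden-layer activations: in-distribution inputs produce
# features near the origin; OOD inputs produce strongly shifted features.
train_feats = rng.normal(0.0, 1.0, size=(500, 16))
id_feats = rng.normal(0.0, 1.0, size=(20, 16))
ood_feats = rng.normal(6.0, 1.0, size=(20, 16))

# Fit the detector on in-distribution activations only.
detector = IsolationForest(random_state=0).fit(train_feats)

# predict() returns +1 for inliers and -1 for outliers.
id_pred = detector.predict(id_feats)
ood_pred = detector.predict(ood_feats)
print("ID flagged as inlier:", (id_pred == 1).mean())
print("OOD flagged as outlier:", (ood_pred == -1).mean())
```

Swapping `IsolationForest` for `sklearn.neighbors.LocalOutlierFactor` (with `novelty=True`) gives the LOF variant mentioned above; in a real deployment the features would come from one or more hidden layers of the monitored DNN.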

The algorithms and techniques developed in this thesis have been evaluated on standard datasets and benchmarks. To summarize our findings, we have developed novel OOD detection algorithms with high accuracy and efficiency; identified OOD detection algorithms with relatively high accuracy and low execution times through benchmarking; developed a framework for DNN performance prediction under continuous distribution shifts, and identified most effective Anomaly Detection algorithms for use in the framework; developed a framework for model compression of DRL agents that is effective in reducing model size and inference time for deployment in RTE systems. The research results are expected to assist system designers in the task of safe and efficient application of DNNs in resource-constrained RTE systems.

Place, publisher, year, edition, pages
Umeå: Umeå University, 2023. p. 49
Keywords
Machine Learning/Deep Learning, Real-Time Embedded systems, Out-of-Distribution Detection, Distribution Shifts, Deep Reinforcement Learning, Model Compression, Policy Distillation.
National subject category
Computer Sciences
Identifiers
urn:nbn:se:umu:diva-214365 (URN)
978-91-8070-161-7 (ISBN)
978-91-8070-160-0 (ISBN)
Public defence
2023-10-09, Triple Helix, Samverkanshuset, Umeå, 13:00 (English)
Available from: 2023-09-18 Created: 2023-09-12 Last updated: 2023-09-13 Bibliographically approved

Open Access in DiVA

fulltext (1608 kB), 66 downloads
File information
File name: FULLTEXT03.pdf
File size: 1608 kB
Checksum (SHA-512): 5571ca3332838d0544fff4d552d795511de1d073d256210c7a9bb38ebd1ac18a193a1afc2486efe2e2494505ba7be9c2ab5d2ff6bfd4b3cfb884903e24bc9b17
Type: fulltext
Mime type: application/pdf

Other links

Publisher's full text
Scopus

Person

Luan, Siyu; Gu, Zonghua

Total: 195 downloads
The number of downloads is the sum of downloads for all full texts. It may include earlier versions that are no longer available.

Total: 340 hits