Publications (10 of 244)
Forough, J., Haddadi, H., Bhuyan, M. H. & Elmroth, E. (2024). Efficient anomaly detection for edge clouds: mitigating data and resource constraints.
2024 (English). Manuscript (preprint) (Other academic)
National Category
Computer Sciences
Identifiers
urn:nbn:se:umu:diva-220244 (URN)
Funder
Umeå University; Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2024-01-30 Created: 2024-01-30 Last updated: 2024-01-31
Sundqvist, T., Bhuyan, M. H. & Elmroth, E. (2024). Robust procedural learning for anomaly detection and observability in 5G RAN. IEEE Transactions on Network and Service Management, 21(2), 1432-1445
2024 (English). In: IEEE Transactions on Network and Service Management, E-ISSN 1932-4537, Vol. 21, no 2, p. 1432-1445. Article in journal (Refereed) Published
Abstract [en]

Most existing large distributed systems have poor observability and cannot use the full potential of machine learning-based behavior analysis. The system logs, which contain the primary source of information, are unstructured and lack the context needed to track procedures and learn the system’s behavior. This work presents a new trace guideline that enables a component- and procedure-based split of the system logs for the future 5G Radio Access Network (RAN). As the system can be broken into smaller pieces, models can more accurately learn the system’s behavior and use the context to improve anomaly detection and observability. The evaluation result is astonishing; where previously state-of-the-art methods struggle to learn the behavior, a fast, dictionary-based algorithm can detect all anomalies and keep false positives close to zero. Troubleshooters can also more quickly identify anomalies and gain useful insights into the component interaction in RAN.

Place, publisher, year, edition, pages
IEEE, 2024
Keywords
observability, trace guidelines, anomaly detection, Radio Access Network, 5G
National Category
Computer Sciences
Identifiers
urn:nbn:se:umu:diva-206058 (URN); 10.1109/TNSM.2023.3321401 (DOI); 2-s2.0-85174831102 (Scopus ID)
Funder
Knut and Alice Wallenberg Foundation
Note

Originally included in thesis in manuscript form.

Available from: 2023-03-27 Created: 2023-03-27 Last updated: 2024-05-08 Bibliographically approved
Seo, E., Pham, V. & Elmroth, E. (2023). Accelerating convergence in wireless federated learning by sharing marginal data. In: 2023 International Conference on Information Networking (ICOIN). Paper presented at 37th International Conference on Information Networking, ICOIN 2023, January 11-14, 2023 (pp. 122-127). IEEE
2023 (English). In: 2023 International Conference on Information Networking (ICOIN), IEEE, 2023, p. 122-127. Conference paper, Published paper (Refereed)
Abstract [en]

Deploying federated learning (FL) over wireless mobile networks can be expensive because of the cost of wireless communication resources. Efforts have been made to reduce communication costs by accelerating model convergence, leading to the development of model-driven methods based on feature extraction, model-integrated algorithms, and client selection. However, the resulting performance gains are limited by the dependence of neural network convergence on input data quality. This work, therefore, investigates the use of marginal shared data (e.g., a single data entry) to accelerate model convergence and thereby reduce communication costs in FL. Experimental results show that sharing even a single piece of data can improve performance by 14.6% and reduce communication costs by 61.13% when using the federated averaging algorithm (FedAvg). Marginal data sharing could therefore be an attractive and practical solution in privacy-flexible environments or collaborative operational systems such as fog robotics and vehicles. Moreover, by assigning new labels to the shared data, it is possible to extend the number of classifying labels of an FL model even when the initial input datasets lack the labels in question.

Place, publisher, year, edition, pages
IEEE, 2023
Series
International conference on information networking, ISSN 1976-7684
Keywords
data sharing, Edge computing, federated learning, wireless mobile network
National Category
Computer Sciences
Identifiers
urn:nbn:se:umu:diva-205646 (URN); 10.1109/ICOIN56518.2023.10048937 (DOI); 000981938900023 (); 2-s2.0-85149182136 (Scopus ID); 9781665462686 (ISBN)
Conference
37th International Conference on Information Networking, ICOIN 2023, January 11-14, 2023
Available from: 2023-03-13 Created: 2023-03-13 Last updated: 2023-09-05 Bibliographically approved
Forough, J., Bhuyan, M. H. & Elmroth, E. (2023). Anomaly detection and resolution on the edge: solutions and future directions. In: 2023 IEEE International Conference on Service-Oriented System Engineering (SOSE): Proceedings. Paper presented at 17th IEEE International Conference on Service-Oriented System Engineering, SOSE 2023, Athens, Greece, July 17-20, 2023 (pp. 227-238). IEEE
2023 (English). In: 2023 IEEE International Conference on Service-Oriented System Engineering (SOSE): Proceedings, IEEE, 2023, p. 227-238. Conference paper, Published paper (Refereed)
Abstract [en]

Anomaly detection and resolution are crucial in edge clouds to ensure that distributed systems operate reliably and securely. This survey presents a comprehensive overview of anomaly detection and resolution strategies specifically designed for edge cloud environments, exploring their strengths, limitations, and applicability in different scenarios. It explores the unique challenges and characteristics of edge cloud systems, providing an in-depth analysis of existing works and tools. Evaluation metrics and datasets used by different methods are examined to provide insights into assessing the performance and efficacy of anomaly detection and resolution approaches. The paper concludes by identifying open challenges and future research directions and by offering practical recommendations, making it a valuable resource for researchers and practitioners involved in enhancing the reliability and security of edge cloud systems.

Place, publisher, year, edition, pages
IEEE, 2023
Series
Proceedings (IEEE International Symposium on Service-Oriented System Engineering), ISSN 2640-8228, E-ISSN 2642-6587
Keywords
Anomaly detection, Anomaly resolution, Edge clouds, Performance anomalies, Security anomalies
National Category
Computer Sciences Computer Systems
Identifiers
urn:nbn:se:umu:diva-216214 (URN); 10.1109/SOSE58276.2023.00034 (DOI); 2-s2.0-85174902690 (Scopus ID); 979-8-3503-2239-2 (ISBN); 979-8-3503-2240-8 (ISBN)
Conference
17th IEEE International Conference on Service-Oriented System Engineering, SOSE 2023, Athens, Greece, July 17-20, 2023
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP); EU, Horizon Europe, 101092711
Available from: 2023-11-06 Created: 2023-11-06 Last updated: 2024-01-30 Bibliographically approved
Kidane, L., Townend, P., Metsch, T. & Elmroth, E. (2023). Automated hyperparameter tuning for adaptive cloud workload prediction. In: UCC '23: Proceedings of the IEEE/ACM 16th International Conference on Utility and Cloud Computing. Paper presented at CC '23: IEEE/ACM 16th International Conference on Utility and Cloud Computing, Taormina (Messina), Italy, December 4-7, 2023. New York: Association for Computing Machinery (ACM)
2023 (English). In: UCC '23: Proceedings of the IEEE/ACM 16th International Conference on Utility and Cloud Computing, New York: Association for Computing Machinery (ACM), 2023. Conference paper, Published paper (Refereed)
Abstract [en]

Efficient workload prediction is essential for enabling timely resource provisioning in cloud computing environments. However, achieving accurate predictions, ensuring adaptability to changing conditions, and minimizing computation overhead pose significant challenges for workload prediction models. Furthermore, the continuous streaming nature of workload metrics requires careful consideration when applying machine learning and data mining algorithms, as manual hyperparameter optimization can be time-consuming and suboptimal. We propose an automated parameter tuning and adaptation approach for workload prediction models and for the concept drift detection algorithms used in predicting future workload. Our method leverages a pre-built knowledge base derived from statistical features of historical data, enabling automatic adjustment of model weights and concept drift detection parameters. Additionally, model adaptation is facilitated through a transfer learning approach. We evaluate the effectiveness of our automated approach by comparing it with static approaches using synthetic and real-world datasets. In our experiments, automating the parameter tuning process and integrating concept drift detection enhances the accuracy and efficiency of workload prediction models by 50%.

Place, publisher, year, edition, pages
New York: Association for Computing Machinery (ACM), 2023
Keywords
Cloud computing, Hyperparameter optimization, Workload prediction, Concept drift, Data mining
National Category
Computer Systems
Identifiers
urn:nbn:se:umu:diva-223451 (URN); 10.1145/3603166.3632244 (DOI); 2-s2.0-85191659681 (Scopus ID); 979-8-4007-0234-1 (ISBN)
Conference
CC '23: IEEE/ACM 16th International Conference on Utility and Cloud Computing, Taormina (Messina), Italy, December 4-7, 2023
Funder
Knut and Alice Wallenberg Foundation, 2019.0352; eSSENCE - An eScience Collaboration
Available from: 2024-04-16 Created: 2024-04-16 Last updated: 2024-05-13 Bibliographically approved
Sundqvist, T., Bhuyan, M. H. & Elmroth, E. (2023). Bottleneck identification and failure prevention with procedural learning in 5G RAN. In: Simmhan Y., Altintas I., Varbanescu A.-L., Balaji P., Prasad A.S., Carnevale L. (Ed.), 2023 IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing (CCGrid). Paper presented at 23rd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid), Bangalore, India, May 1-4, 2023 (pp. 426-436). IEEE
2023 (English). In: 2023 IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing (CCGrid) / [ed] Simmhan Y., Altintas I., Varbanescu A.-L., Balaji P., Prasad A.S., Carnevale L., IEEE, 2023, p. 426-436. Conference paper, Published paper (Refereed)
Abstract [en]

To meet the low latency requirements of 5G Radio Access Networks (RAN), it is essential to learn where performance bottlenecks occur. As parts are distributed and virtualized, it becomes troublesome to identify where unwanted delays occur. Today, vendors spend huge manual effort analyzing key performance indicators (KPIs) and system logs to detect these bottlenecks. The 5G architecture allows a flexible scaling of microservices to handle the variation in traffic. But knowing how, when, and where to scale is difficult without a detailed latency analysis. In this article, we propose a novel method that combines procedural learning with latency analysis of system log events. The method, which we call LogGenie, learns the latency pattern of the system at different load scenarios and automatically identifies the parts with the most significant increase in latency. Our evaluation in an advanced 5G testbed shows that LogGenie can provide a more detailed analysis than previous research has achieved and help troubleshooters locate bottlenecks faster. Finally, through experiments, we show how a latency prediction model can dynamically fine-tune the behavior where bottlenecks occur. This lowers resource utilization, makes the architecture more flexible, and allows the system to fulfill its latency requirements.

Place, publisher, year, edition, pages
IEEE, 2023
Keywords
bottleneck detection, latency, RAN, failure prevention, 5G
National Category
Computer Sciences
Identifiers
urn:nbn:se:umu:diva-205960 (URN); 10.1109/CCGrid57682.2023.00047 (DOI); 2-s2.0-85166323115 (Scopus ID); 979-8-3503-0119-9 (ISBN); 979-8-3503-0120-5 (ISBN)
Conference
23rd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid), Bangalore, India, May 1-4, 2023
Funder
Knut and Alice Wallenberg Foundation
Available from: 2023-03-24 Created: 2023-03-24 Last updated: 2023-08-15 Bibliographically approved
Townend, P., Martí, A. P., De La Iglesia, I., Matskanis, N., Ohlson Timoudas, T., Hallmann, T., . . . Abdou, M. (2023). COGNIT: challenges and vision for a serverless and multi-provider cognitive cloud-edge continuum. In: 2023 IEEE International Conference on Edge Computing and Communications (EDGE). Paper presented at 2023 IEEE International Conference on Edge Computing and Communications (EDGE), Chicago, Illinois, USA, July 2-8, 2023 (pp. 12-22). IEEE
2023 (English). In: 2023 IEEE International Conference on Edge Computing and Communications (EDGE), IEEE, 2023, p. 12-22. Conference paper, Published paper (Refereed)
Abstract [en]

Use of the serverless paradigm in cloud application development is growing rapidly, primarily driven by its promise to free developers from the responsibility of provisioning, operating, and scaling the underlying infrastructure. However, modern cloud-edge infrastructures are characterized by large numbers of disparate providers, constrained resource devices, platform heterogeneity, infrastructural dynamicity, and the need to orchestrate geographically distributed nodes and devices over public networks. This presents significant management complexity that must be addressed if serverless technologies are to be used in production systems. This position paper introduces COGNIT, a major new European initiative aiming to integrate AI technology into cloud-edge management systems to create a Cognitive Cloud reference framework and associated tools for serverless computing at the edge. COGNIT aims to: 1) support an innovative new serverless paradigm for edge application management and enhanced digital sovereignty for users and developers; 2) enable on-demand deployment of large-scale, highly distributed and self-adaptive serverless environments using existing cloud resources; 3) optimize data placement according to changes in energy efficiency heuristics and application demands and behavior; 4) enable secure and trusted execution of serverless runtimes. We identify and discuss seven research challenges related to the integration of serverless technologies with multi-provider Edge infrastructures and present our vision for how these challenges can be solved. We introduce a high-level view of our reference architecture for serverless cloud-edge continuum systems, and detail four motivating real-world use cases that will be used for validation, drawing from domains within Smart Cities, Agriculture and Environment, Energy, and Cybersecurity.

Place, publisher, year, edition, pages
IEEE, 2023
Series
IEEE International Conference on Edge Computing, E-ISSN 2767-9918
National Category
Computer Systems
Research subject
Computer Science
Identifiers
urn:nbn:se:umu:diva-214140 (URN); 10.1109/EDGE60047.2023.00015 (DOI); 001063201700002 (); 2-s2.0-85173547015 (Scopus ID); 979-8-3503-0483-1 (ISBN); 979-8-3503-0484-8 (ISBN)
Conference
2023 IEEE International Conference on Edge Computing and Communications (EDGE), Chicago, Illinois, USA, July 2-8, 2023
Funder
EU, Horizon Europe, 101092711
Available from: 2023-09-05 Created: 2023-09-05 Last updated: 2023-11-06 Bibliographically approved
Saleh Sedghpour, M. R., Obeso Duque, A., Cai, X., Skubic, B., Elmroth, E., Klein, C. & Tordsson, J. (2023). Hydragen: a microservice benchmark generator. In: C. Ardagna; N. Atukorala; P. Beckman; C.K. Chang; R.N. Chang; C. Evangelinos; J. Fan; G.C. Fox; J. Fox; C. Hagleitner; Z. Jin; T. Kosar; M. Parashar (Ed.), 2023 IEEE 16th international conference on cloud computing (CLOUD). Paper presented at 16th IEEE International Conference on Cloud Computing, CLOUD 2023, Hybrid/Chicago, July 2-8, 2023 (pp. 189-200). IEEE, 2023-July
2023 (English). In: 2023 IEEE 16th international conference on cloud computing (CLOUD) / [ed] C. Ardagna; N. Atukorala; P. Beckman; C.K. Chang; R.N. Chang; C. Evangelinos; J. Fan; G.C. Fox; J. Fox; C. Hagleitner; Z. Jin; T. Kosar; M. Parashar, IEEE, 2023, Vol. 2023-July, p. 189-200. Conference paper, Published paper (Refereed)
Abstract [en]

Microservice-based architectures have become ubiquitous in large-scale software systems. Experimental cloud researchers constantly propose enhanced resource management mechanisms for such systems. These mechanisms need to be evaluated using both realistic and flexible microservice benchmarks to study in which ways diverse application characteristics can affect their performance and scalability. However, current microservice benchmarks have limitations including static computational complexity, limited architectural scale, and fixed topology (i.e., number of tiers, fan-in, and fan-out characteristics).

We therefore propose HydraGen, a tool that enables researchers to systematically generate benchmarks with different computational complexities and topologies, to tackle experimental evaluation of performance at scale for web-serving applications, with a focus on inter-service communication. To illustrate the potential of our open-source tool, we demonstrate how it can reproduce an existing microservice benchmark with preserved architectural properties. We also demonstrate how HydraGen can enrich the evaluation of cloud management systems based on a case study related to traffic engineering.

Place, publisher, year, edition, pages
IEEE, 2023
Series
IEEE International Conference on Cloud Computing, CLOUD, ISSN 2159-6182, E-ISSN 2159-6190
Keywords
microservices, benchmark generator, performance analysis, emulation, validation, cloud systems
National Category
Computer Systems
Research subject
Computer Science
Identifiers
urn:nbn:se:umu:diva-206987 (URN); 10.1109/CLOUD60044.2023.00030 (DOI); 2-s2.0-85174317366 (Scopus ID); 9798350304817 (ISBN); 9798350304824 (ISBN)
Conference
16th IEEE International Conference on Cloud Computing, CLOUD 2023, Hybrid/Chicago, July 2-8, 2023
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP); Google
Note

Originally included in thesis in manuscript form. 

Available from: 2023-04-24 Created: 2023-04-24 Last updated: 2023-11-06 Bibliographically approved
Metsch, T., Viktorsson, M., Hoban, A., Vitali, M., Iyer, R. & Elmroth, E. (2023). Intent-driven orchestration: enforcing service level objectives for cloud native deployments. SN Computer Science, 4(3), Article ID 268.
2023 (English). In: SN Computer Science, ISSN 2662-995X, Vol. 4, no 3, article id 268. Article in journal (Refereed) Published
Abstract [en]

The introduction of microservices and functions using serverless deployment styles for cloud-native applications will trigger a shift in the orchestration paradigm towards an intent-driven model. In this model we shift from imperatively declaring an object’s state to the declaration of a set of desired intents. Intent-driven orchestration (IDO) enables the management of applications through their service level objectives (SLOs) while minimizing service owner and administrator overhead. By enabling service owners to express the desired target key performance indicator (KPI) objectives for their service components instead of declaratively defining the required state and resources, we enable ease of use and abstraction from underlying platforms. By adding a planning component to a Kubernetes-based orchestration stack, the feasibility of translating service objectives into actionable decisions is demonstrated. As this new architecture component introduces more autonomy in the control plane, a means to evaluate the results of planning is defined.

Place, publisher, year, edition, pages
Springer Nature, 2023
Keywords
Key performance indicator, Resource orchestration, Service deployment planner, Service level objectives, Service orchestration
National Category
Computer Sciences Software Engineering
Identifiers
urn:nbn:se:umu:diva-206010 (URN); 10.1007/s42979-023-01698-0 (DOI); 2-s2.0-85150462132 (Scopus ID)
Funder
Knut and Alice Wallenberg Foundation, 2019.0352; eSSENCE - An eScience Collaboration
Available from: 2023-03-28 Created: 2023-03-28 Last updated: 2023-03-28 Bibliographically approved
Seo, E. & Elmroth, E. (2023). MadFed: enhancing federated learning with marginal-data model fusion. IEEE Access, 11, 102669-102680
2023 (English). In: IEEE Access, E-ISSN 2169-3536, Vol. 11, p. 102669-102680. Article in journal (Refereed) Published
Abstract [en]

As the demand for intelligent applications at the network edge grows, so does the need for effective federated learning (FL) techniques. However, FL often relies on non-identically and non-independently distributed local datasets across end devices, which could result in considerable performance degradation. Prior solutions, such as model-driven approaches based on knowledge distillation, meta-learning, and transfer learning, have provided some reprieve. However, their performance suffers under heterogeneous local datasets and highly skewed data distributions. To address these challenges, this study introduces the MArginal Data fusion FEDerated Learning (MadFed) approach, a groundbreaking fusion of model- and data-driven methodologies. By utilizing marginal data, MadFed mitigates data distribution skewness, improves the maximum achievable accuracy, and reduces communication costs. Furthermore, the study demonstrates that the fusion of marginal data can significantly improve performance even with minimal data entries, such as a single entry. For instance, it provides up to a 15.4% accuracy increase and 70.4% communication cost savings when combined with established model-driven methodologies. Conversely, relying solely on these model-driven methodologies can result in poor performance, especially with highly skewed datasets. Significantly, MadFed extends its effectiveness across various FL algorithms and offers a unique method to augment label sets of end devices, thereby enhancing the utility and applicability of federated learning in real-world scenarios. The proposed approach is not only efficient but also adaptable and versatile, promising broader application and potential for widespread adoption in the field.

Place, publisher, year, edition, pages
IEEE, 2023
Keywords
Computational modeling, Costs, Data integration, Data models, Edge computing, Federated learning, Performance evaluation, Training
National Category
Computer Sciences
Identifiers
urn:nbn:se:umu:diva-214778 (URN); 10.1109/ACCESS.2023.3315654 (DOI); 2-s2.0-85171574845 (Scopus ID)
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP); Swedish National Infrastructure for Computing (SNIC); Knut and Alice Wallenberg Foundation
Available from: 2023-10-02 Created: 2023-10-02 Last updated: 2023-10-02 Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0002-2633-6798