umu.se Publications
1 - 49 of 49
  • 1.
    Argus, Markus
    Umeå University, Faculty of Science and Technology, Department of Physics.
    Machine learning for wavelets to enhance PET reconstruction. 2021. Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    In the field of nuclear medicine, positron emission tomography (PET) plays an important role as one of the most common diagnostic tools in medical imaging. However, various physical degradation factors occur when the data is recorded, which leads to a low signal-to-noise ratio. This makes the quality of the PET images less than optimal. The proposed method for solving this problem is to use a machine learning approach to find a sparse representation of the sinograms, where a suitable sparse representation should retain as much of the signal and as little of the noise as possible. To accomplish this, a sparse autoencoder was trained on wavelet decompositions of sinograms simulated from medical images in order to learn underlying structures. Three different wavelet families were tested: Daubechies 4, biorthogonal 4.4, and Haar. The trained model was able to find sparse representations of the input sinograms in the wavelet domain. Although the sparse autoencoder managed to learn the basic structures of the sinograms, it struggled with the more complex details. Compared to a conventional denoising method using hard thresholding, the sparse autoencoder did not produce as good a result in terms of the reconstructed PET image quality.
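
The conventional baseline named in this abstract, hard thresholding in the wavelet domain, can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes a one-dimensional signal of even length, a single-level orthonormal Haar transform, and a hand-picked threshold, and the function names are hypothetical.

```python
import math

# Illustrative sketch only: single-level Haar transform on a 1-D signal.
# The thesis works on wavelet decompositions of sinograms; names and the
# threshold value here are assumptions made for the example.

def haar_forward(signal):
    """Return (approximation, detail) coefficients for an even-length signal."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Invert haar_forward."""
    s = 1.0 / math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * s)
        out.append((a - d) * s)
    return out

def hard_threshold(coeffs, threshold):
    """Keep coefficients whose magnitude exceeds the threshold, zero the rest."""
    return [c if abs(c) > threshold else 0.0 for c in coeffs]

def denoise(signal, threshold):
    """Threshold only the detail coefficients, then reconstruct."""
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, hard_threshold(detail, threshold))

if __name__ == "__main__":
    print(denoise([4.0, 4.2, 8.0, 2.0], threshold=0.5))
```

In this sketch the small difference between the first two samples falls below the threshold and is smoothed away, while the large jump between the last two samples is kept.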

  • 2.
    Bohlin, Ludvig
    et al.
    Umeå University, Faculty of Science and Technology, Department of Physics.
    Edler, Daniel
    Umeå University, Faculty of Science and Technology, Department of Physics.
    Lancichinetti, Andrea
    Umeå University, Faculty of Science and Technology, Department of Physics.
    Rosvall, Martin
    Umeå University, Faculty of Science and Technology, Department of Physics.
    Community Detection and Visualization of Networks with the Map Equation Framework. 2014. In: Measuring Scholarly Impact: Methods and Practice / [ed] Ying Ding, Ronald Rousseau, Dietmar Wolfram, Springer, 2014, p. 3-34. Chapter in book (Refereed)
    Abstract [en]

    Large networks contain plentiful information about the organization of a system. The challenge is to extract useful information buried in the structure of myriad nodes and links. Therefore, powerful tools for simplifying and highlighting important structures in networks are essential for comprehending their organization. Such tools are called community-detection methods and they are designed to identify strongly intraconnected modules that often correspond to important functional units. Here we describe one such method, known as the map equation, and its accompanying algorithms for finding, evaluating, and visualizing the modular organization of networks. The map equation framework is very flexible and can identify two-level, multi-level, and overlapping organization in weighted, directed, and multiplex networks with its search algorithm Infomap. Because the map equation framework operates on the flow induced by the links of a network, it naturally captures flow of ideas and citation flow, and is therefore well-suited for analysis of bibliometric networks.

  • 3.
    Buckland, Philip I.
    et al.
    Umeå University, Faculty of Arts, Department of historical, philosophical and religious studies, Environmental Archaeology Lab.
    Sjölander, Mattias
    Umeå University, Faculty of Arts, Department of historical, philosophical and religious studies, Environmental Archaeology Lab.
    Eriksson, Erik J.
    ICT Services and System Development (ITS), Umeå University, Umeå, Sweden.
    Strategic Environmental Archaeology Database (SEAD). 2018. In: Encyclopedia of global archaeology / [ed] Smith, C., Cham: Springer, 2018, 2. Chapter in book (Refereed)
    Abstract [en]

    Environmental archaeology encompasses a wide range of scientific methods for analyzing the results of past human activities, environments, climates, and, perhaps most importantly, the relationships between these. Many of these methods are referred to as proxy analyses, denoting the illumination of the past as interpreted indirectly through the evidence of fossil organisms or properties. These lines of evidence, or proxy data sources, are assumed to reflect past conditions by way of their dependence on them. For example, a species of beetle may only survive within a specific climate range, and thus its presence in samples indicates this climate at the time of deposition; organic waste deposited around a farmstead will raise soil phosphate levels above those of the surrounding land; and the presence of cereal grains in postholes suggests their local cultivation or import, usage, or storage.

  • 4.
    Calatayud, Joaquín
    et al.
    Umeå University, Faculty of Science and Technology, Department of Physics.
    Bernardo-Madrid, Ruben
    Neuman, Magnus
    Umeå University, Faculty of Science and Technology, Department of Physics.
    Rojas, Alexis
    Umeå University, Faculty of Science and Technology, Department of Physics.
    Rosvall, Martin
    Umeå University, Faculty of Science and Technology, Department of Physics.
    Exploring the solution landscape enables more reliable network community detection. 2019. In: Physical review. E, ISSN 2470-0045, E-ISSN 2470-0053, Vol. 100, no 5, article id 052308. Article in journal (Refereed)
    Abstract [en]

    To understand how a complex system is organized and functions, researchers often identify communities in the system's network of interactions. Because it is practically impossible to explore all solutions to guarantee the best one, many community-detection algorithms rely on multiple stochastic searches. But for a given combination of network and stochastic algorithms, how many searches are sufficient to find a solution that is good enough? The standard approach is to pick a reasonably large number of searches and select the network partition with the highest quality or derive a consensus solution based on all network partitions. However, if different partitions have similar qualities such that the solution landscape is degenerate, the single best partition may miss relevant information, and a consensus solution may blur complementary communities. Here we address this degeneracy problem with coarse-grained descriptions of the solution landscape. We cluster network partitions based on their similarity and suggest an approach to determine the minimum number of searches required to describe the solution landscape adequately. To make good use of all partitions, we also propose different ways to explore the solution landscape, including a significance clustering procedure. We test these approaches on synthetic networks and a real-world network using two contrasting community-detection algorithms: The algorithm that can identify more general structures requires more searches, and networks with clearer community structures require fewer searches. We also find that exploring the coarse-grained solution landscape can reveal complementary solutions and enable more reliable community detection.

  • 5.
    Chang, Shuangshuang
    et al.
    School of Computer Science and Engineering, Northeastern University, Shenyang, China.
    Bi, Ran
    School of Computer Science and Technology, Dalian University of Technology, Dalian, China.
    Sun, Jinghao
    School of Computer Science and Technology, Dalian University of Technology, Dalian, China.
    Liu, Weichen
    School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore.
    Yu, Qi
    School of Computer Science and Technology, Dalian University of Technology, Dalian, China.
    Deng, Qingxu
    School of Computer Science and Engineering, Northeastern University, Shenyang, China.
    Gu, Zonghua
    Umeå University, Faculty of Science and Technology, Department of Applied Physics and Electronics.
    Towards minimum WCRT bound for DAG tasks under prioritized list scheduling algorithms. 2022. In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, ISSN 0278-0070, E-ISSN 1937-4151, Vol. 41, no 11, p. 3874-3885. Article in journal (Refereed)
    Abstract [en]

    Many modern real-time parallel applications can be modeled as a directed acyclic graph (DAG) task. Recent studies show that the worst-case response time (WCRT) bound of a DAG task can be significantly reduced when the execution order of the vertices is determined by the priority assigned to each vertex of the DAG. How to obtain the optimal vertex priority assignment, and how far the best-known WCRT bound of a DAG task is from the minimum WCRT bound, are still open problems. In this paper, we aim to construct the optimal vertex priority assignment and derive the minimum WCRT bound for the DAG task. We encode the priority assignment problem into an integer linear programming (ILP) formulation. To solve the ILP model efficiently, we do not involve all variables or constraints. Instead, we solve the ILP model iteratively, i.e., we initially solve the ILP model with only a few primary variables and constraints, and then at each iteration, we increment the ILP model with the variables and constraints which are more likely to derive the optimal priority assignment. Experimental work shows that our method is capable of solving the ILP model optimally without involving too many variables or constraints, e.g., for instances with 50 vertices, we find the optimal priority assignment by involving 12.67% of the variables on average and within several minutes on average.

  • 6.
    Durango, Jonas
    et al.
    Dellkrantz, Manfred
    Maggio, Martina
    Klein, Cristian
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Papadopoulos, Alessandro Vittorio
    Hernandez-Rodriguez, Francisco
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Arzen, Karl-Erik
    Control-theoretical load-balancing for cloud applications with brownout. 2014. In: 2014 IEEE 53RD ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC), 2014, p. 5320-5327. Conference paper (Refereed)
    Abstract [en]

    Cloud applications are often subject to unexpected events like flash crowds and hardware failures. Without predictable behaviour, users may abandon an unresponsive application. This problem has been partially solved on two separate fronts: first, by adding a self-adaptive feature called brownout inside cloud applications to bound response times by modulating user experience, and, second, by introducing replicas - copies of the applications having the same functionalities - for redundancy and adding a load-balancer to direct incoming traffic. However, existing load-balancing strategies interfere with brownout self-adaptivity. Load-balancers are often based on response times, which are already controlled by the self-adaptive features of the application and hence are not a good indicator of how well a replica is performing. In this paper, we present novel load-balancing strategies, specifically designed to support brownout applications. They base their decision not on response time, but on user experience degradation. We implemented our strategies in a self-adaptive application simulator, together with some state-of-the-art solutions. Results obtained in multiple scenarios show that the proposed strategies bring significant improvements when compared to the state-of-the-art ones.

  • 7.
    Eklund, Pauline
    Umeå University, Faculty of Science and Technology, Department of Applied Physics and Electronics.
    Implementering av ISOBUS på ECU vid Ålö AB. 2017. Independent thesis Basic level (professional degree), 10 credits / 15 HE credits. Student thesis
    Abstract [en]

    A serial bus called ISOBUS based on CAN is becoming more and more common in the agriculture and forestry industry. The bus specifies communication between tractors and their implements. Earlier each implement had its own monitor to show its functionalities, which could lead to a lot of monitors in the tractor cabin. ISOBUS requires only one monitor, called VT (Virtual Terminal), regardless of the manufacturer of the implement.

    The aim of this thesis is to implement ISOBUS on Ålö’s ECU (Electronic Control Unit) so that it can present its functionalities to the VT. The aim is to integrate a purchased third-party commercial ISOBUS library on the ECU. The amount of work needed to achieve ISOBUS compatibility without a third-party library shall be estimated, and if there is time the task shall also be carried out. An object pool based on Ålö’s existing interface shall be created, where the object pool is the graphical interface shown on the VT. A demonstrator of ISOBUS VT shall be built.

    Implementing the third-party library required hardware functions for the CAN bus. The hardware functions include receiving messages from a buffer and sending messages directly on the bus. For the library to be alive and running, it had to be initialized and called periodically. The result is that the library was implemented on the ECU and data flows between the ECU and the VT.

    To achieve ISOBUS compatibility without a third-party library, the existing protocol on Ålö’s ECU has to be replaced by base support for ISOBUS. Then a last part must be written to achieve full compatibility. The commands that the ISOBUS standard defines between the ECU and the VT have to be written, along with callback functions that are called when the VT sends commands to the ECU. Management of answers and errors also has to be implemented. ISOBUS compatibility without a third-party library wasn’t carried out, but the amount of work was estimated and a general description of what has to be done was written. The conclusion is that it requires a lot of work and scrutiny of the standard. The advantage is that you get an insight into how the system works and the ability to influence functionalities yourself.

    The object pool design was based on Ålö’s existing interface. Menu systems were implemented, and a linear bar graph and a meter can show the height and angle of the tractor loader bucket. Different ways to show a menu system have been discussed. The result is an object pool with the basic functions for Ålö’s interface; the demonstrator presents these functionalities. The interface for the VT can be made quite similar to Ålö’s existing interface, with some differences such as fonts, image quality and menu functions.

  • 8.
    Eriksson, Lennart
    et al.
    Trygg, Johan
    Umeå University, Faculty of Science and Technology, Department of Chemistry.
    Wold, Svante
    Umeå University, Faculty of Science and Technology, Department of Chemistry. Umetrics Inc., 42 Pine Hill Rd, Hollis, NH 03049, USA.
    PLS-trees (R), a top-down clustering approach. 2009. In: Journal of Chemometrics, ISSN 0886-9383, E-ISSN 1099-128X, Vol. 23, no 11, p. 569-580. Article in journal (Refereed)
    Abstract [en]

    A hierarchical clustering approach based on a set of PLS models is presented. Called PLS-Trees (R), this approach is analogous to classification and regression trees (CART), but uses the scores of PLS regression models as the basis for splitting the clusters, instead of the individual X-variables. The split of one cluster into two is made along the sorted first X-score (t(1)) of a PLS model of the cluster, but may potentially be made along a direction corresponding to a combination of scores. The position of the split is selected according to the improvement of a weighted combination of (a) the variance of the X-score, (b) the variance of Y and (c) a penalty function discouraging an unbalanced split with very different numbers of observations. Cross-validation is used to terminate the branches of the tree, and to determine the number of components of each cluster PLS model. Some obvious extensions of the approach to OPLS-Trees and trees based on hierarchical PLS or OPLS models with the variables divided in blocks depending on their type, are also mentioned. The possibility to greatly reduce the number of variables in each PLS model on the basis of their PLS w-coefficients is also pointed out. The approach is illustrated by means of three examples. The first two examples are quantitative structure-activity relationship (QSAR) data sets, while the third is based on hyperspectral images of liver tissue for identifying different sources of variability in the liver samples.

  • 9.
    Feng, Zhiwei
    et al.
    School of Computer Science and Engineering, Northeastern University, Shenyang, China.
    Gu, Zonghua
    Umeå University, Faculty of Science and Technology, Department of Applied Physics and Electronics.
    Yu, Haichuan
    School of Computer Science and Engineering, Northeastern University, Shenyang, China.
    Deng, Qingxu
    School of Computer Science and Engineering, Northeastern University, Shenyang, China.
    Niu, Linwei
    Department of Electrical Engineering and Computer Science, Howard University, Washington, USA.
    Online re-routing and re-scheduling of time-triggered flows for fault tolerance in time-sensitive networking. 2022. In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, ISSN 0278-0070, E-ISSN 1937-4151, Vol. 41, no 11, p. 4253-4264. Article in journal (Refereed)
    Abstract [en]

    Time-Sensitive Networking (TSN) is an industry-standard networking protocol that is widely deployed in safety-critical industrial and automotive networks thanks to its advantages of deterministic transmission and bounded end-to-end delay for Time-Triggered (TT) flows. In this paper, we focus on TT flows, and address the issue of fault tolerance against permanent and transient faults with both spatial and temporal redundancy. We present an efficient heuristic algorithm for online incremental re-routing and re-scheduling of disrupted flows due to permanent faults, assuming the paths and schedules of existing flows stay fixed and cannot be modified. It is complementary to and can be combined with offline routing and scheduling algorithms for achieving fault tolerance based on Frame Replication and Elimination for Reliability (FRER) (IEEE 802.1CB). Performance evaluation shows that our approach can better recover the system's Degree of Redundancy (DoR) and has a higher acceptance rate than related work.

  • 10.
    Feng, Zhiwei
    et al.
    School of Computer Science and Engineering, Northeastern University, China.
    Wu, Chaoquan
    School of Computer Science and Engineering, Northeastern University, China.
    Deng, Qingxu
    School of Computer Science and Engineering, Northeastern University, China.
    Lin, Yuhan
    School of Computer Science and Engineering, Northeastern University, China.
    Gao, Shichang
    School of Computer Science and Engineering, Northeastern University, China.
    Gu, Zonghua
    Umeå University, Faculty of Science and Technology, Department of Applied Physics and Electronics.
    On the scheduling of fault-tolerant time-sensitive networking with IEEE 802.1CB. 2024. In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, ISSN 0278-0070, E-ISSN 1937-4151. Article in journal (Refereed)
    Abstract [en]

    Time-Sensitive Networking (TSN) has become the most popular technique in modern safety-critical Automotive and Industrial Automation Networks by providing deterministic transmission policies. However, the data of TSN messages may be affected by transient faults. IEEE 802.1CB, a reliability standard in TSN, protects against such faults by providing disjoint redundant routes for each stream. However, the unique assumption may present a new challenge, i.e., an inadequate number of redundant routes that may negatively impact stream scheduling. This paper presents an offline fault-tolerant TSN scheduling approach that considers such impacts for real-time streams (such as Time-Triggered (TT) and Audio Video Bridging (AVB) streams). Specifically, we intend to calculate the minimum upper bound number of disjoint routes required for each stream to meet the reliability requirements, subsequently enhancing the network’s schedulability. We also propose a service degradation function for AVB streams when the network is under heavy load caused by redundant transmissions of TT streams. This function will maintain schedulability and reliability for AVB streams. Experiments with small- and large-scale synthetic networks show the efficiency of the approach.

  • 11.
    Fällman, Daniel
    Umeå University, Faculty of Social Sciences, Department of Informatics.
    Mediated reality through glasses or binoculars? Exploring use models of wearable computing in the context of aircraft maintenance. 2003. In: International Journal of Human-Computer Interaction, ISSN 1044-7318, E-ISSN 1532-7590, Vol. 15, no 2, p. 265-284. Article in journal (Refereed)
    Abstract [en]

    Aircraft maintenance is often considered a typical application for specialized wearable computer systems, designed and used for a specific purpose only. From the findings of an interpretive case study conducted at Scandinavian Airlines Systems, the largest commercial airline in Scandinavia, there is evidence to question the potential usefulness of such a system.

    Instead, in this article, aircraft maintenance is used to explore the potentialities of different use models of wearable computing (i.e., the way the system is designed, used, and understood, and which should also make sense in other environments). The use models are (a) a vertical model addressed by a binoculars-analogy, where the system is designed and used for a specific purpose; and (b) a horizontal model, approached by perceiving wearable computers as eyeglasses, where the system is used throughout the day for a number of activities. Problems with both models suggest an alternative use model, which is presented as the embodied use model, drawing on the notion of embodiment introduced by Ihde (1990).

  • 12.
    Hansson, Joel
    Umeå University, Faculty of Science and Technology, Department of Applied Physics and Electronics.
    Live captioning and translation application for Android. 2023. Independent thesis Basic level (professional degree), 10 credits / 15 HE credits. Student thesis
    Abstract [en]

    Captioning has long been used in media to help D/deaf and hard-of-hearing persons. Captioning, however, is difficult and time-consuming manual work. With the rapid evolution of automated speech recognition (ASR) systems, live captioning of everyday speech will soon be a practical reality. A proof-of-concept Android application for use with a specific headset has been created using the built-in Android SpeechRecognizer, a free open-source API (application programming interface) available for Android phones. This application, unlike many existing solutions, focuses on two major features: communication with in-situ microphones and hardware via Bluetooth, and long-duration speech recognition. Long-duration speech recognition was made possible using the segmented session mode of the SpeechRecognizer, which was recently added in API version 33 (March 2023). The results, while not complete, show promise for future development. Some initial testing shows a word error rate (WER) of 8%, but further testing is required. Tests with noise also show that the system is surprisingly resistant to static noise. The application shows promise and development will continue in the coming weeks. This project was financed by Hörselforskningsfonden in project FA21-0017 and was performed under the supervision of Amin Saremi.
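
The word error rate quoted in this abstract is conventionally computed as the word-level edit distance (substitutions, deletions, and insertions) divided by the number of words in the reference transcript. The sketch below shows that standard definition. It is not taken from the thesis: the function name is hypothetical, and it assumes whitespace tokenisation and case-sensitive matching.

```python
# Illustrative sketch of the standard WER definition; the thesis does not
# state how its 8% figure was computed, so tokenisation is an assumption.

def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)

if __name__ == "__main__":
    print(word_error_rate("the quick brown fox", "the quick brown box"))
```

Because insertions count as errors, WER defined this way can exceed 100% when the hypothesis is much longer than the reference.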

  • 13.
    He, Qingqiang
    et al.
    The Hong Kong Polytechnic University, Hong Kong, Hong Kong.
    Guan, Nan
    City University of Hong Kong, Hong Kong, Hong Kong.
    Lv, Mingsong
    The Hong Kong Polytechnic University, Hong Kong, Hong Kong; Northeastern University, China.
    Gu, Zonghua
    Umeå University, Faculty of Science and Technology, Department of Applied Physics and Electronics.
    On the degree of parallelism in real-time scheduling of DAG tasks. 2023. In: 2023 Design, Automation & Test in Europe. Conference & Exhibition (DATE): Proceedings, IEEE, 2023, p. 1-6. Conference paper (Refereed)
    Abstract [en]

    Real-time scheduling and analysis of parallel tasks modeled as directed acyclic graphs (DAG) have been intensively studied in recent years. The degree of parallelism of DAG tasks is an important characterization in scheduling. This paper revisits the definition and the computing algorithms for the degree of parallelism of DAG tasks, and clarifies some misunderstandings regarding the degree of parallelism which exist in the real-time literature. Based on the degree of parallelism, we propose a real-time scheduling approach for DAG tasks, which is quite simple but rather effective and outperforms the state-of-the-art by a considerable margin.

  • 14.
    Hjort, Andrej
    Umeå University, Faculty of Science and Technology, Department of Applied Physics and Electronics.
    Prototyp för analys av körteknik med motorcykel. 2015. Independent thesis Basic level (university diploma), 10 credits / 15 HE credits. Student thesis
    Abstract [en]

    Microelectromechanical systems (MEMS) are increasingly used in vehicles to collect measurements of angle and acceleration, which can then be used together with other systems in the vehicle for increased safety and stability. In cars, smart systems have long been used for airbag systems and ABS brakes. Because MEMS sensors offer advantages in price, size, and availability, more and more opportunities are also seen for development around motorcycles. The goal of this project has been to develop a low-cost and somewhat simpler product with which the rider can see information about their lean angle and acceleration with the help of an accelerometer and a gyroscope. Tests have been carried out with the product mounted on a motorcycle, on data such as lean angle and acceleration after a so-called complementary filter. This was done to see which filter values work best for a possible finished product. From the tests it can be concluded that engine vibrations affect the measurement of acceleration and angle. These are reduced to some extent by vibration-damping solutions such as rubber mounts, but in order to obtain measurable values, a software low-pass filter and a complementary filter are implemented.
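
A complementary filter of the kind named in this abstract is commonly written as a weighted blend of the integrated gyroscope rate (trusted over short time scales) and the angle derived from the accelerometer (trusted over long time scales). The sketch below shows that common textbook form. It is not the thesis's code: the function names, the weight `alpha = 0.98`, and the sample period are assumptions chosen for illustration.

```python
import math

# Illustrative sketch of a textbook complementary filter; names, alpha and
# the sample period are assumptions, not values from the thesis.

def accel_to_angle(ay, az):
    """Tilt angle in degrees from two accelerometer axes (gravity only)."""
    return math.degrees(math.atan2(ay, az))

def complementary_filter(samples, dt, alpha=0.98, initial_angle=0.0):
    """Fuse (gyro_rate_deg_per_s, accel_angle_deg) samples into angle estimates.

    alpha close to 1 favours the gyroscope; (1 - alpha) slowly pulls the
    estimate toward the accelerometer angle, which cancels gyro drift.
    """
    angle = initial_angle
    estimates = []
    for gyro_rate, accel_angle in samples:
        angle = alpha * (angle + gyro_rate * dt) + (1.0 - alpha) * accel_angle
        estimates.append(angle)
    return estimates

if __name__ == "__main__":
    # A stationary sensor tilted by 10 degrees: zero gyro rate, constant
    # accelerometer angle. The estimate rises gradually toward 10 degrees.
    readings = [(0.0, 10.0)] * 500
    result = complementary_filter(readings, dt=0.01)
    print(round(result[0], 3), round(result[-1], 3))
```

A larger `alpha` suppresses vibration-induced accelerometer noise more strongly but makes the estimate slower to correct gyroscope drift, which is the trade-off that tuning the filter values addresses.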

  • 15.
    Jiang, Zhe
    et al.
    University of York, United Kingdom.
    Dai, Xiaotian
    University of York, United Kingdom.
    Burns, Alan
    University of York, United Kingdom.
    Audsley, Neil
    City, University of London, United Kingdom.
    Gu, Zonghua
    Umeå University, Faculty of Science and Technology, Department of Applied Physics and Electronics.
    Gray, Ian
    University of York, United Kingdom.
    A high-resilience imprecise computing architecture for mixed-criticality systems. 2023. In: IEEE Transactions on Computers, ISSN 0018-9340, E-ISSN 1557-9956, Vol. 72, no 1, p. 29-42. Article in journal (Refereed)
    Abstract [en]

    Conventional mixed-criticality systems (MCSs) are designed to terminate the execution of less critical tasks in exceptional situations so that the timing properties of more critical tasks can be preserved. Such a strategy can be controversial and has proven difficult to implement in practice, as it can lead to hazards and reduced functionality due to the absence of the discarded tasks. To mitigate this issue, the imprecise mixed-criticality system model (IMCS) has been proposed. In such a model, instead of completely dropping less-critical tasks, these tasks are executed as much as possible through the use of decreased computation precision. Although IMCS could effectively improve the survivability of the less-critical tasks, it also introduces three key drawbacks - run-time computation errors, real-time performance degradation, and lack of flexibility. In this paper, we present a novel IMCS framework, which can (i) mitigate the computation errors caused by imprecise computation; (ii) achieve real-time performance near to that of a conventional MCS; (iii) enhance system-level throughput; and (iv) provide flexibility for run-time configuration. We describe the design details of HIART-MCS, and then present the corresponding theoretical analysis and optimisation method for its run-time configuration. Finally, HIART-MCS is evaluated against other MCS frameworks using a variety of experimental metrics.

  • 16.
    Jiang, Zhe
    et al.
    Southeast University, Nanjing, China.
    Dai, Xiaotian
    University of York, United Kingdom.
    Wei, Ran
    University of Cambridge, United Kingdom.
    Gray, Ian
    University of York, United Kingdom.
    Gu, Zonghua
    Umeå University, Faculty of Science and Technology, Department of Applied Physics and Electronics.
    Zhao, Qingling
    Nanjing University of Science and Technology, China.
    Zhao, Shuai
    Sun Yat-sen University, China.
    NPRC-I/O: a NoC-based real-time I/O system with reduced contention and enhanced predictability. 2023. In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, ISSN 0278-0070, E-ISSN 1937-4151, p. 1-1. Article in journal (Refereed)
    Abstract [en]

    All systems rely on inputs and outputs (I/Os) to perceive and interact with their surroundings. In safety-critical systems, it is important to guarantee both the performance and time-predictability of I/O operations. However, with the continued growth of architectural complexity in modern safety-critical systems, satisfying such real-time requirements has become increasingly challenging due to complex I/O transaction paths and extensive hardware contention. In this paper, we present a new NoC-based Predictable I/O system framework (NPRC-I/O) which reduces this contention and ensures the performance and time-predictability of I/O operations. Specifically, NPRC-I/O contains a programmable I/O command controller (NPRC-CC) and a runtime reconfigurable NoC (RNoC), which provides the capability to adjust I/O transaction paths at run-time. Using this flexibility, we construct an end-to-end transmission latency analysis and an optimisation engine that produces configurations for NPRC-I/O and the I/O traffic in a given system. The constructed analysis and optimisation engine guarantee the timing of all hard real-time traffic while reducing the deadline misses of soft real-time traffic and overall transmission latency.

  • 17.
    Johansson Hultberg, Andreas
    Umeå University, Faculty of Science and Technology, Department of Applied Physics and Electronics. Umeå universitet.
    Parallellisering av Sliding Extensive Cancellation Algorithm (ECA-S) för passiv radar med OpenMP2021Independent thesis Basic level (professional degree), 10 credits / 15 HE creditsStudent thesis
    Abstract [en]

    Software parallelization has gained increasing interest since the manufacturing of ever smaller transistors within integrated circuits has begun to stagnate. This has led to the development of new processing units with an increasing number of cores. Parallelization is an optimization technique that allows the user to utilize parallel processes in order to streamline algorithm flows. This study examines the performance benefits that a passive bistatic radar system can obtain through parallelization and code refactoring. The study focuses mainly on investigating the use of parallel instructions within a shared memory model on a Central Processing Unit (CPU) with the use of an application programming interface, namely OpenMP. Quantitative data is collected to compare the runtime of the most central algorithm in the passive radar system, namely the Extensive Cancellation Algorithm (ECA). ECA can be used to suppress unwanted clutter in the surveillance signal, with the purpose of creating clear target detections of airborne objects. The algorithm is, however, computationally demanding, which has led to the development of faster versions such as the Sliding ECA (ECA-S). Despite the ongoing development, the algorithm is still relatively computationally demanding, which can lead to long execution times within the radar system. In this study, a MATLAB implementation of ECA-S is converted to C in order to take advantage of the fast execution time of the procedural programming language. Parallelism is introduced within the converted algorithm by the use of Intel's thread methodology and then applied within two different operating systems. The study shows that a speedup by a factor of 24 can be obtained in the programming language C while still ensuring the correctness of the results. The results also showed that refactoring a MATLAB algorithm could result in 73% faster code and that C-MEX implementations are twice as slow as a C implementation. Finally, the study pointed out that real-time operation can be achieved for a passive bistatic radar system with the use of the programming language C and by using parallel instructions within a shared memory model on a CPU.

  • 18.
    Johansson, William
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    A Comparison of CI/CD Tools on Kubernetes2022Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Kubernetes is a fast-emerging technological platform for developing and operating modern IT applications. The capacity to deploy new apps and change old ones at a faster rate with less chance of error is one of the key value propositions of the Kubernetes platform. A continuous integration and continuous deployment (CI/CD) pipeline is a crucial component of the technology. Such pipelines compile all updated code, run specific tests, and may then automatically deploy the produced code artifacts to a running system.

    There is a thriving ecosystem of CI/CD tools. Tools can be divided into two types: integrated and standalone. Integrated tools are used for both pipeline phases, CI and CD. Standalone tools are used for just one of the phases, so two independent programs are needed to build the pipeline. Some tools predate Kubernetes and may be adapted to operate on Kubernetes, while others are new and designed specifically for use with Kubernetes clusters.

    CD systems are classified as push-style (artifacts from outside the cluster are pushed into the cluster) or pull-style (a CD tool running inside the cluster pulls built artifacts into the cluster). The choice between pull- and push-style pipelines affects how cluster credentials are managed and whether they ever need to leave the cluster.

    This thesis investigates the deployment time, fault tolerance, and access security of pipelines. Using a simple microservices application, a testing setup is created to measure the metrics of the pipelines. Drone, Argo Workflows, ArgoCD, and GoCD are the tools compared in this study. These tools are coupled to form various pipelines.

    The pipeline using Kubernetes-specific tools, Argo Workflows and ArgoCD, is the fastest, the pipeline with GoCD is somewhat slower, and the Drone pipeline is the slowest. The pipeline that used Argo Workflows and ArgoCD could also withstand failures. The other pipelines that used Drone and GoCD were unable to recover and timed out. Pull pipelines handle Kubernetes access differently from push pipelines, as the Kubernetes cluster credentials do not have to leave the cluster, whereas push pipelines need the cluster credentials in the external environment where the CD tool is running.

  • 19.
    Jonsson, Isak
    Umeå University, Faculty of Science and Technology, Computing Science.
    Recursive Blocked Algorithms, Data Structures, and High-Performance Software for Solving Linear Systems and Matrix Equations2003Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    This thesis deals with the development of efficient and reliable algorithms and library software for factorizing matrices and solving matrix equations on high-performance computer systems. The architectures of today's computers consist of multiple processors, each with multiple functional units. The memory systems are hierarchical with several levels, each having different speed and size. The practical peak performance of a system is reached only by considering all of these characteristics. One portable method for achieving good system utilization is to express a linear algebra problem in terms of level 3 BLAS (Basic Linear Algebra Subprogram) transformations. The most important operation is GEMM (GEneral Matrix Multiply), which typically defines the practical peak performance of a computer system. There are efficient GEMM implementations available for almost any platform, thus an algorithm using this operation is highly portable.

    The dissertation focuses on how recursion can be applied to solve linear algebra problems. Recursive linear algebra algorithms have the potential to automatically match the size of subproblems to the different memory hierarchies, leading to much better utilization of the memory system. Furthermore, recursive algorithms expose level 3 BLAS operations, and reveal task parallelism. The first paper handles the Cholesky factorization for matrices stored in packed format. Our algorithm uses a recursive packed matrix data layout that enables the use of high-performance matrix-matrix multiplication, in contrast to the standard packed format. The resulting library routine requires half the memory of full storage, yet the performance is better than for full storage routines.
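
    As an illustration of how recursion exposes matrix-matrix work in a Cholesky factorization, the following is a minimal sketch and not the thesis's library routine. It assumes a dense symmetric positive definite matrix held as plain Python lists; it does not implement the recursive packed layout, and the function names `recursive_cholesky` and `solve_row` are hypothetical. The triangular solve and the Schur complement update are the steps that a tuned implementation would hand to level 3 BLAS.

```python
import math


def solve_row(l, b):
    """Solve x * L^T = b for the row vector x, where L is lower triangular."""
    x = []
    for j in range(len(b)):
        partial = sum(x[p] * l[j][p] for p in range(j))
        x.append((b[j] - partial) / l[j][j])
    return x


def recursive_cholesky(a):
    """Return lower-triangular L with L * L^T == a (a symmetric positive definite)."""
    n = len(a)
    if n == 1:
        return [[math.sqrt(a[0][0])]]

    # Split the matrix into a 2x2 block structure at k = n // 2.
    k = n // 2
    m = n - k
    a11 = [row[:k] for row in a[:k]]
    a21 = [row[:k] for row in a[k:]]
    a22 = [row[k:] for row in a[k:]]

    # Factor the leading block recursively.
    l11 = recursive_cholesky(a11)

    # Triangular solve: L21 * L11^T = A21 (a TRSM-like step).
    l21 = [solve_row(l11, row) for row in a21]

    # Schur complement: A22 - L21 * L21^T (a matrix-matrix multiply step).
    s = [
        [a22[i][j] - sum(l21[i][p] * l21[j][p] for p in range(k)) for j in range(m)]
        for i in range(m)
    ]

    # Factor the updated trailing block recursively.
    l22 = recursive_cholesky(s)

    # Assemble L = [[L11, 0], [L21, L22]].
    top = [l11[i] + [0.0] * m for i in range(k)]
    bottom = [l21[i] + l22[i] for i in range(m)]
    return top + bottom
```

    Because the split point halves the problem at every level, the subproblems shrink until they fit each level of the memory hierarchy, which is the property the paragraph above attributes to recursive algorithms.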

    Papers two and three introduce recursive blocked algorithms for solving triangular Sylvester-type matrix equations. For these problems, recursion together with superscalar kernels produces new algorithms that give 10-fold speedups compared to existing routines in the SLICOT and LAPACK libraries. We show that our recursive algorithms also have a significant impact on the execution time of solving unreduced problems and when used in condition estimation. By recursively splitting several problem dimensions simultaneously, parallel algorithms for shared memory systems are obtained. The fourth paper introduces a library, RECSY, consisting of a set of routines implemented in Fortran 90 using the ideas presented in papers two and three. Using performance monitoring tools, the last paper evaluates the possible gain in using different matrix blocking layouts and the impact of superscalar kernels in the RECSY library.

  • 20.
    Kampik, Timotheus
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science. Signavio GmbH, Berlin, Germany.
    Najjar, Amro
    Umeå University, Faculty of Science and Technology, Department of Computing Science. AI-Robolab/ICR, Computer Science and Communications, University of Luxembourg, Esch-sur-Alzette, Luxembourg.
    Integrating Multi-agent Simulations into Enterprise Application Landscapes2019In: Highlights of Practical Applications of Survivable Agents and Multi-Agent Systems: The PAAMS Collection. PAAMS 2019 / [ed] De La Prieta F. et al., 2019, p. 100-111Conference paper (Refereed)
    Abstract [en]

    To cope with increasingly complex business, political, and economic environments, agent-based simulations (ABS) have been proposed for modeling complex systems such as human societies, transport systems, and markets. ABS enable experts to assess the influence of exogenous parameters (e.g., climate changes or stock market prices), as well as the impact of policies and their long-term consequences. Despite some successes, the use of ABS is hindered by a set of interrelated factors. First, ABS are mainly created and used by researchers and experts in academia and specialized consulting firms. Second, the results of ABS are typically not automatically integrated into the corresponding business process. Instead, the integration is undertaken by human users who are responsible for adjusting the implemented policy to take into account the results of the ABS. These limitations are exacerbated when the results of the ABS affect multi-party agreements (e.g., contracts) since this requires all involved actors to agree on the validity of the simulation, on how and when to take its results into account, and on how to split the losses/gains caused by these changes. To address these challenges, this paper explores the integration of ABS into enterprise application landscapes. In particular, we present an architecture that integrates ABS into cross-organizational enterprise resource planning (ERP) processes. As part of this, we propose a multi-agent systems simulator for the Hyperledger blockchain and describe an example supply chain management scenario type to illustrate the approach.

  • 21.
    Karlsson, Lars
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Tisseur, Francoise
    Algorithms for Hessenberg-Triangular Reduction of Fiedler Linearization of Matrix Polynomials2015In: SIAM Journal on Scientific Computing, ISSN 1064-8275, E-ISSN 1095-7197, Vol. 37, no 3, p. C384-C414Article in journal (Refereed)
    Abstract [en]

    Small- to medium-sized polynomial eigenvalue problems can be solved by linearizing the matrix polynomial and solving the resulting generalized eigenvalue problem using the QZ algorithm. The QZ algorithm, in turn, requires an initial reduction of a matrix pair to Hessenberg-triangular (HT) form. In this paper, we discuss the design and evaluation of high-performance parallel algorithms and software for HT reduction of a specific linearization of matrix polynomials of arbitrary degree. The proposed algorithm exploits the sparsity structure of the linearization to reduce the number of operations and improve the cache reuse compared to existing algorithms for unstructured inputs. Experiments on both a workstation and a high-performance computing system demonstrate that our structure-exploiting parallel implementation can outperform both the general LAPACK routine DGGHRD and the prototype implementation DGGHR3 of a general blocked algorithm.

  • 22.
    Linde, Mattias
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Multi-core scalability measurements: issues and solutions2013In: Applied Parallel and Scientific Computing: 11th International Conference, PARA 2012, Helsinki, Finland, June 10-13, 2012, Revised Selected Papers / [ed] Pekka Manninen, Per Öster, Springer Berlin/Heidelberg, 2013, p. 319-327Conference paper (Refereed)
    Abstract [en]

    We discuss how developments in power management for multi-core processors, which use automatic frequency scaling to achieve higher performance, can cause artifacts in performance comparisons and give pessimistic efficiency estimates for algorithms. Overclocking also causes underestimates of the theoretical peak performance of the CPU, as can be seen in some cases on the TOP500 list. We show that overclocking capabilities, when available, must be taken into account in thread scheduling for better overall performance.
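
    The effect on efficiency estimates can be illustrated with a small worked calculation. This is a sketch under stated assumptions and not taken from the paper: the clock frequencies and timings are hypothetical, and run time is assumed to be inversely proportional to clock frequency (a compute-bound code). The function names are likewise hypothetical.

```python
def parallel_efficiency(t1, tp, p):
    """Classic parallel efficiency: single-thread time over p times the p-thread time."""
    return t1 / (p * tp)


def frequency_adjusted_efficiency(t1, tp, p, f1_ghz, fp_ghz):
    """Efficiency after rescaling the single-thread time to the all-core clock.

    f1_ghz is the clock during the single-thread run (boosted),
    fp_ghz is the clock when all p cores are busy.
    Assumes run time scales as 1 / frequency.
    """
    t1_at_fp = t1 * f1_ghz / fp_ghz
    return t1_at_fp / (p * tp)


# Hypothetical measurement: one thread boosts to 3.6 GHz, four threads run at 3.0 GHz.
naive = parallel_efficiency(t1=10.0, tp=3.0, p=4)
adjusted = frequency_adjusted_efficiency(t1=10.0, tp=3.0, p=4, f1_ghz=3.6, fp_ghz=3.0)
print(f"naive efficiency:    {naive:.2f}")
print(f"adjusted efficiency: {adjusted:.2f}")
```

    With these made-up numbers the naive estimate is about 0.83, while the frequency-adjusted estimate is 1.0: the algorithm scales perfectly per clock cycle, and the apparent loss comes entirely from the single-thread baseline running at a higher clock.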

  • 23.
    Lindström, Johan
    Umeå University, Faculty of Science and Technology, Department of Applied Physics and Electronics.
    Webbaserad styrning av TV-apparater2015Independent thesis Basic level (university diploma), 10 credits / 15 HE creditsStudent thesis
    Abstract [en]

    This project is based on a wish from the employees at Umeå Energi to have a solution for remote control of TV monitors in the office. The idea is a web-based solution to make it possible for the employees to open a web page and access basic functions on the four monitors in the office. The project has been done with the help of an old PC that works as an intermediary between the monitors and the users. The PC was running a webserver that could control functions on the monitors using scripts. The server was connected to the monitors through a serial interface (RS-232). All software being used was free and open, and most of the hardware Umeå Energi already had on site, thus costs have remained low. The final solution was a Linux-based server with Ubuntu and Apache, and the scripts used Bash code. A USB hub and four USB-to-serial adapters were used to connect the monitors to the server. All basic functions worked as intended, except that commands sometimes had to be applied twice by pressing the refresh button in the web browser. This is believed to have been caused by weak power supply to the USB hub.

  • 24.
    Lyu, Zhihan
    et al.
    Umeå University, Faculty of Science and Technology, Department of Applied Physics and Electronics. Shenzhen Institutes of Advanced Technology, Chinese Academy of Science, Shenzhen 518055, China.
    Zhong, Chen
    Future Cities Laboratory Singapore, ETH Zürich 117566, Singapore.
    Feng, Liangbing
    Shenzhen Institutes of Advanced Technology, Chinese Academy of Science, Shenzhen 518055, China.
    Chen, Qian
    Shenzhen Institutes of Advanced Technology, Chinese Academy of Science, Shenzhen 518055, China.
    Feng, Shengzhong
    Shenzhen Institutes of Advanced Technology, Chinese Academy of Science, Shenzhen 518055, China.
    A high-speed index for the multi-scale overlay landscape map on ubiquitous WebGIS2013In: Shenzhen Daxue Xuebao (Ligong Ban)/Journal of Shenzhen University Science and Engineering, ISSN 1000-2618, Vol. 30, no 5, p. 480-485Article in journal (Refereed)
    Abstract [en]

    A high-speed data structure is investigated to solve the problem that existing data structures cannot support the multi-scale representation of landscape map data on ubiquitous WebGIS. The necessity of the index is analyzed, the algorithm is presented, and the possibility of supporting landscape maps is discussed. In this index, the main tree is obtained by deforming the index structure of a region quadtree partitioned according to the rule of the pyramid structure; a sub-tree structure supports the overlay of landscape map data. The index reflects the changes in spatial resolution according to the depth of the tree. All the nodes of the tree are containers of spatial objects. For evaluation, comparative experiments are carried out using our index structure; the results show that this index method can represent massive landscape map data effectively in WebGIS. The structure has been used in a city landscape map on WebGIS and in a campus landscape map on mobile phones.
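
    The idea that tree depth tracks spatial resolution and that every node acts as a container can be sketched as follows. This is an illustrative region quadtree only, not the authors' index: the class name `QuadNode`, its methods, and the sample objects are hypothetical, and the overlay sub-tree described in the abstract is not modelled.

```python
class QuadNode:
    """Region quadtree node; every node, not only the leaves, stores objects."""

    def __init__(self, x0, y0, x1, y1, depth=0):
        self.bounds = (x0, y0, x1, y1)
        self.depth = depth      # depth 0 is the coarsest resolution
        self.objects = []       # spatial objects shown at this resolution
        self.children = {}      # quadrant index (0..3) -> QuadNode, created lazily

    def _quadrant(self, x, y):
        x0, y0, x1, y1 = self.bounds
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        return (1 if x >= mx else 0) + (2 if y >= my else 0)

    def _child(self, q):
        if q not in self.children:
            x0, y0, x1, y1 = self.bounds
            mx, my = (x0 + x1) / 2, (y0 + y1) / 2
            cx0, cx1 = (mx, x1) if q & 1 else (x0, mx)
            cy0, cy1 = (my, y1) if q & 2 else (y0, my)
            self.children[q] = QuadNode(cx0, cy0, cx1, cy1, self.depth + 1)
        return self.children[q]

    def insert(self, x, y, level, obj):
        """Store obj in the node at depth `level` whose cell contains (x, y)."""
        node = self
        while node.depth < level:
            node = node._child(node._quadrant(x, y))
        node.objects.append(obj)

    def query(self, x, y, max_level):
        """Collect objects on the path from the root down to depth max_level."""
        found, node = [], self
        while node is not None:
            found.extend(node.objects)
            if node.depth >= max_level:
                break
            node = node.children.get(node._quadrant(x, y))
        return found


root = QuadNode(0, 0, 100, 100)
root.insert(10, 10, 0, "city outline")
root.insert(10, 10, 2, "building A")
root.insert(90, 90, 2, "building B")
print(root.query(10, 10, 0))
print(root.query(10, 10, 2))
```

    Querying at a shallow depth returns only coarse objects, while a deeper query at the same point adds the finer ones along the path, which mirrors how a map client would request more detail as the user zooms in.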

  • 25.
    Malhi, Avleen
    et al.
    Bournemouth University, United Kingdom; Aalto University, Finland.
    Javed, Asad
    Basemark, Espoo, Finland.
    Yousefnezhad, Narges
    Aalto University, Espoo, Finland.
    Främling, Kary
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    IoT open messaging standards: performance comparison with MQTT and CoAP protocols2023In: Proceedings - 2023 International Conference on Future Internet of Things and Cloud, FiCloud 2023, Institute of Electrical and Electronics Engineers (IEEE), 2023, p. 130-135Conference paper (Refereed)
    Abstract [en]

    Communication protocols are the foundation of the Internet of Things (IoT), enabling the seamless integration of hundreds of thousands of devices into a lightweight IoT communication network. Further, interoperability is a major concern with regard to connecting the multitude of heterogeneous devices, sensors, actuators, agents, etc. Open messaging standards are designed to overcome the problem of horizontal interoperability by providing a peer-to-peer communication network and making real-time interaction possible in heterogeneous systems. The current research focuses on performance analysis to review the applicability of the open messaging standards available for IoT communication. In this paper, we design and implement experiments to analyze the protocols' behaviour with respect to two performance metrics: throughput and latency. The Message Queue Telemetry Transport (MQTT) and Constrained Application Protocol (CoAP) protocols are used for comparative analysis against the open messaging standards. The evaluation uses various experimental scenarios to analyze performance results. It is observed that the open messaging standards outperform MQTT and CoAP in IoT applications across the evaluation parameters considered. The analysis indicates that open messaging standards lead in the IoT domain, which is well explained by the obtained results.

  • 26.
    Mehta, Amardeep
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Bayuh Lakew, Ewnetu
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Tordsson, Johan
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Utility-based Allocation of Industrial IoT Applications in Mobile Edge Clouds2018In: 2018 IEEE 37th International Performance Computing and Communications Conference (IPCCC), IEEE, 2018Conference paper (Refereed)
    Abstract [en]

    Mobile Edge Clouds (MECs) create new opportunities and challenges in terms of scheduling and running applications that have a wide range of latency requirements, such as intelligent transportation systems, process automation, and smart grids. We propose a two-tier scheduler for allocating runtime resources to Industrial Internet of Things (IIoT) applications in MECs. The scheduler at the higher level runs periodically, monitoring system state and the performance of applications, and decides whether to admit new applications and migrate existing applications. In contrast, the lower-level scheduler decides which application will get the runtime resource next. We use performance-based metrics that tell the extent to which the runtimes are meeting the Service Level Objectives (SLOs) of the hosted applications. The Application Happiness metric is based on a single application's performance and SLOs. The Runtime Happiness metric is based on the Application Happiness of the applications the runtime is hosting. These metrics may be used for decision-making by the scheduler, rather than runtime utilization, for example. We evaluate four scheduling policies for the high-level scheduler and five for the low-level scheduler. The objective for the schedulers is to minimize cost while meeting the SLO of each application. The policies are evaluated with respect to the number of runtimes, the impact on the performance of applications, and the utilization of the runtimes. The results of our evaluation show that the high-level policy based on Runtime Happiness combined with the low-level policy based on Application Happiness outperforms other policies for the schedulers, including the bin packing and random strategies. In particular, our combined policy requires up to 30% fewer runtimes than the simple bin packing strategy and increases the runtime utilization by up to 40% for the Edge Data Center (DC) in the scenarios we evaluated.

  • 27.
    Meng, Haitao
    et al.
    Technical University of Munich, Germany.
    Li, Changcai
    Sun Yat-sen University, China.
    Chen, Gang
    Sun Yat-sen University, China.
    Gu, Zonghua
    Umeå University, Faculty of Science and Technology, Department of Applied Physics and Electronics.
    Knoll, Alois
    Technical University of Munich, Germany.
    ER3D: An Efficient Real-time 3D object detection framework for autonomous driving2023In: 2023 IEEE 29th International conference on parallel and distributed systems (ICPADS) / [ed] Cristina Ceballos, IEEE Computer Society, 2023, p. 1157-1164Conference paper (Refereed)
    Abstract [en]

    3D object detection is a vital computer vision task in mobile robotics and autonomous driving. However, most existing methods have focused exclusively on achieving high accuracy, leading to complex and bulky systems that cannot be deployed in a real-time manner. In this paper, we propose the ER3D (Efficient and Real-time 3D) object detection framework, which takes stereo images as input and predicts 3D bounding boxes. Instead of using a complex network architecture, we leverage a fast-but-inaccurate method of semi-global matching (SGM) for depth estimation. To eliminate the accuracy degradation in 3D detection caused by inaccurate depth estimation, we introduce a decoupled regression head and a 3D distance-consistency IoU loss to boost the accuracy of the 3D detector with a small computing overhead. ER3D achieves both high-precision and real-time performance to enable practical applications of 3D object detection on robotic systems. Extensive experiments comparing ER3D with the state of the art demonstrate its superior practicability: it achieves comparable detection accuracy with a significant lead in inference efficiency.

  • 28.
    Moritz, Monica
    Umeå University, Faculty of Teacher Education, Mathematics, Technology and Science Education.
    Provkonstruktion för nätet: Validerat med Bloom´s reviderade taxonomi2007Independent thesis Basic level (professional degree), 10 credits / 15 HE creditsStudent thesis
    Abstract [en]

    Creating fair tests is one of the most difficult tasks facing any teacher. This report describes a teaching experiment in the practical use of Bloom's revised taxonomy to validate the questions of a test that is built for, and taken on, a computer. The study group consists of my students taking the upper secondary course Programming A. A Norwegian study has previously shown that boys' test results improve when tests are taken on a computer, which was also the outcome in my study. Unlike in the Norwegian study, however, the girls' results also improved in my study. With this technique for creating tests, it would be simple and feasible to create national computer-based tests in several different courses, which could help teachers around the country find the right level of knowledge for the courses.

  • 29.
    Nayak, Rajendra Prasad
    et al.
    Department of Computer Science and Engineering, GCEK (Govt.), Bhawanipatna, BPUT, Rourkela, India.
    Sethi, Srinivas
    Department of Computer Science Engineering and Applications, IGIT (Govt.), Sarang, BPUT, Rourkela, India.
    Bhoi, Sourav Kumar
    Department of Computer Science Engineering and Applications, PMEC (Govt.), Berhampur, BPUT, Rourkela, India.
    Sahoo, Kshira Sagar
    Umeå University, Faculty of Science and Technology, Department of Computing Science. Department of Computer Science and Engineering, SRM University, AP, Amaravati, India.
    Nayyar, Anand
    Graduate School, Faculty of Information Technology, Duy Tan University, Da Nang, Viet Nam.
    ML-MDS: Machine Learning based Misbehavior Detection System for Cognitive Software-defined Multimedia VANETs (CSDMV) in smart cities2023In: Multimedia tools and applications, ISSN 1380-7501, E-ISSN 1573-7721, Vol. 82, no 3, p. 3931-3951Article in journal (Refereed)
    Abstract [en]

    Security is a major concern in vehicular networks for reliable communication between the source and the destination in smart cities. Data, these days, is in the form of safety or non-safety messages in formats like text, audio, images, video, etc. These information exchanges between the two parties need to be updated with a trust value (TV) by analyzing the communication data. In this paper, a machine learning-based misbehavior detection system (ML-MDS) is proposed for cognitive software-defined multimedia vehicular networks (CSDMV) in smart cities. In the proposed system, before communication, the vehicle must be aware of the TV of other vehicles. If the TV for a vehicle is higher than a threshold (th), then the communication happens and the whole transaction information is sent to the local software-defined network controller (LSDNC) for classification of behavior using the ML algorithm. After this, the TV is updated as per the last transaction status at the LSDNC and the updated TV of the vehicle is sent to the main SDN controller for information gathering. In this system, the best ML algorithm for the ML-MDS model is selected by considering decision tree, support vector machine (SVM), neural network (NN), and logistic regression (LR) algorithms. The classification accuracy is evaluated using the UNSW_NB-15 standard dataset for detecting normal and malicious vehicles. NN shows better classification accuracy than the other algorithms. The proposed ML-MDS is implemented and evaluated using the OMNeT++ network simulator and the Simulation of Urban Mobility (SUMO) road traffic simulator by considering various parameters such as detection accuracy, detection time, and energy consumption. From the results, it is observed that the detection accuracy of the proposed ML-MDS system is 98.4%, compared with 80.2% for the scheme of Grover et al. Also, to address scalability, the dataset size is increased and performance is evaluated in the Orange 3.26.0 analytics tool; NN is again found to be the best algorithm, showing high accuracy in detecting the attackers.

  • 30.
    Nilsson, Peter
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Reveman, David
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Glitz: hardware accelerated image compositing using OpenGL2004In: USENIX association proceedings of the FREENIX track, 2004 USENIX annual technical conference, Berkeley: USENIX - The Advanced Computing Systems Association, 2004, p. 29-40Conference paper (Refereed)
    Abstract [en]

    In recent years, 2D graphics applications and window systems have tended to use more demanding graphics features such as alpha blending, image transformations and anti-aliasing. These features contribute to user interfaces by making it possible to add more visual effects as well as new usable functionalities. Altogether, this makes the graphical interface a more hospitable, as well as efficient, environment for the user. Even with today's powerful computers these tasks constitute a heavy burden on the CPU. This is why many proprietary window systems have developed powerful 2D graphics engines to carry out these tasks by utilizing the acceleration capabilities in modern graphics hardware. We present Glitz, an open source implementation of such a graphics engine, a portable 2D graphics library that can be used to render hardware accelerated graphics. Glitz is layered on top of OpenGL and is designed to act as an additional backend for cairo, providing it with hardware accelerated output. Furthermore, an effort has been made to investigate whether the level of hardware acceleration provided by the X Window System can be improved by using Glitz to carry out its fundamental drawing operations.

  • 31.
    Ojeda-May, Pedro
    et al.
    Umeå University, Faculty of Science and Technology, High Performance Computing Center North (HPC2N). Umeå University, Faculty of Science and Technology, Department of Chemistry.
    Nam, Kwangho
    Umeå University, Faculty of Science and Technology, Department of Chemistry. Department of Chemistry and Biochemistry, University of Texas at Arlington, Arlington, Texas 76019-0065, United States.
    Acceleration of Semiempirical QM/MM Methods through Message Passage Interface (MPI), Hybrid MPI/Open Multiprocessing, and Self-Consistent Field Accelerator Implementations2017In: Journal of Chemical Theory and Computation, ISSN 1549-9618, E-ISSN 1549-9626, Vol. 13, no 8, p. 3525-3536Article in journal (Refereed)
    Abstract [en]

    The strategy and implementation of scalable and efficient semiempirical (SE) QM/MM methods in CHARMM are described. The serial version of the code was first profiled to identify routines that required parallelization. Afterward, the code was parallelized and accelerated with three approaches. The first approach was the parallelization of the entire QM/MM routines, including the Fock matrix diagonalization routines, using the CHARMM message passage interface (MPI) machinery. In the second approach, two different self-consistent field (SCF) energy convergence accelerators were implemented using density and Fock matrices as targets for their extrapolations in the SCF procedure. In the third approach, the entire QM/MM and MM energy routines were accelerated by implementing the hybrid MPI/open multiprocessing (OpenMP) model in which both the task- and loop-level parallelization strategies were adopted to balance loads between different OpenMP threads. The present implementation was tested on two solvated enzyme systems (including <100 QM atoms) and an SN2 symmetric reaction in water. The MPI version exceeded existing SE QM methods in CHARMM, which include the SCC-DFTB and SQUANTUM methods, by at least 4-fold. The use of SCF convergence accelerators further accelerated the code by approximately 12-35% depending on the size of the QM region and the number of CPU cores used. Although the MPI version displayed good scalability, the performance was diminished for large numbers of MPI processes due to the overhead associated with MPI communications between nodes. This issue was partially overcome by the hybrid MPI/OpenMP approach, which displayed better scalability for a larger number of CPU cores (up to 64 CPUs in the tested systems).

  • 32.
    Rambaran, Theresa
    et al.
    Umeå University, Faculty of Medicine, Department of Public Health and Clinical Medicine, Section of Sustainable Health.
    Schirhagl, Romana
    Department of Biomedical Engineering, University Medical Center Groningen, Groningen University, Groningen, Netherlands.
    Nanotechnology from lab to industry - a look at current trends2022In: Nanoscale Advances, E-ISSN 2516-0230, Vol. 4, no 18, p. 3664-3675Article, review/survey (Refereed)
    Abstract [en]

    Nanotechnology holds great promise and is hyped by many as the next industrial evolution. Medicine, food and cosmetics, agriculture and environmental health, and technology industries already profit from nanotechnology innovations and their influence is expected to increase drastically in the near future. However, there are also many challenges that need to be overcome to bring a nanotechnological product or business to the market. In this article we discuss current examples of nanotechnology that have been successfully introduced in the market and their relevance and geographical spread. We then discuss different partners for scientists and their role in the commercialization process. Finally, we review the different steps it takes to bring a nanotechnology to the market, highlight the many difficulties related to these steps, and provide a roadmap for the journey from lab to industry which can be beneficial to researchers.

  • 33.
    Ren, Keni
    et al.
    Umeå University, Faculty of Science and Technology, Department of Applied Physics and Electronics.
    Karlsson, Johannes
    Umeå University, Faculty of Science and Technology, Department of Applied Physics and Electronics.
    Liuska, Markus
    Hartikainen, Markku
    Hansen, Inger
    Jorgensen, Grete H. M.
    A sensor-fusion-system for tracking sheep location and behaviour. 2020. In: International Journal of Distributed Sensor Networks, ISSN 1550-1329, E-ISSN 1550-1477, Vol. 16, no 5. Article in journal (Refereed)
    Abstract [en]

    The growing interest in precision livestock farming is prompted by a desire to understand the basic behavioural needs of the animals and optimize the contribution of each animal. The aim of this study was to develop a system that automatically generates individual animal behaviour and localization data in sheep. A sensor-fusion-system tracking individual sheep position and detecting sheep standing/lying behaviour was proposed. The mean error and standard deviation of the sheep position measured by the ultra-wideband location system was 0.357 +/- 0.254 m, and the sensitivities of the sheep standing and lying detection performed by infrared radiation cameras and three-dimensional computer vision technology were 98.16% and 100%, respectively. The proposed system was able to generate individual animal activity reports, and real-time detection was achieved. The system can increase the convenience of animal behaviour studies and of monitoring animal welfare in the production environment.

  • 34.
    Rezk, Nesma
    et al.
    Halmstad University.
    Purnaprajna, Madhura
    Amrita School of Engineering: Bangalore, Karnataka, India.
    Nordström, Tomas
    Umeå University, Faculty of Science and Technology, Department of Applied Physics and Electronics.
    Ul-Abdin, Zain
    Halmstad University.
    Recurrent Neural Networks: An Embedded Computing Perspective. 2020. In: IEEE Access, E-ISSN 2169-3536, Vol. 81, no 1, p. 57967-57996. Article in journal (Refereed)
    Abstract [en]

    Recurrent Neural Networks (RNNs) are a class of machine learning algorithms used for applications with time-series and sequential data. Recently, there has been a strong interest in executing RNNs on embedded devices. However, difficulties have arisen because RNNs require high computational capability and a large memory space. In this paper, we review existing implementations of RNN models on embedded platforms and discuss the methods adopted to overcome the limitations of embedded systems. We define the objectives of mapping RNN algorithms on embedded platforms and the challenges facing their realization. Then, we explain the components of RNN models from an implementation perspective. We also discuss the optimizations applied to RNNs to run efficiently on embedded platforms. Finally, we compare the defined objectives with the implementations and highlight some open research questions and aspects currently not addressed for embedded RNNs. Overall, applying algorithmic optimizations to RNN models and decreasing the memory access overhead is vital to obtain high efficiency. To further increase the implementation efficiency, we point out the more promising optimizations that could be applied in future research. Additionally, this article observes that high performance has been targeted by many implementations, while flexibility has, as yet, been attempted less often. Thus, the article provides some guidelines for RNN hardware designers to support flexibility in a better manner.

  • 35.
    Risberg, Emil
    Umeå University, Faculty of Science and Technology, Department of Applied Physics and Electronics.
    Front loader task automation with distance measurement. 2018. Independent thesis Basic level (professional degree), 10 credits / 15 HE credits. Student thesis
    Abstract [en]

    Front loaders are used in many agricultural applications, in daily tasks performed by operators. The operator uses several different pieces of front loader equipment to carry out tasks that are more or less repetitive. Although different equipment is used, these tasks share a similar working cycle.

    A working cycle consists of multiple sequences. One of them involves loader control for desired object handling. If that part of the working cycle could be automated it could reduce the workload and operator stress, which would increase an operator’s daily working capacity.

    The aim of the thesis is to investigate whether a part of the front loader working cycle can be automated. The goal is to create a prototype that can be used to automate the loader control part of the working cycle. The prototype will be implemented on a front loader for testing of a working cycle.

    To achieve the aim and goal, the work includes research into which sensor technology is best suited for the prototype. It also includes tests to see whether accuracy differs between a cheap and an expensive sensor, and whether it is possible to automate a part of the front loader working cycle.

    A sensor analysis was made and ultrasonic sensor technology was chosen for the prototype. One expensive prototype sensor and one cheap extra sensor for comparison testing were chosen. Software was written for the sensors and they were tested on a test bench. The prototype was implemented on a front loader for testing of a working cycle.

    The prototype can measure distance and send the required commands based on that. This indicates that it is possible to automate a part of the front loader working cycle.

  • 36.
    Saleh Sedghpour, Mohammad Reza
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Garlan, David
    Carnegie Mellon University, Pittsburgh, USA.
    Schmerl, Bradley
    Carnegie Mellon University, Pittsburgh, USA.
    Klein, Cristian
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Tordsson, Johan
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Breaking the vicious circle: self-adaptive microservice circuit breaking and retry. 2023. In: 2023 IEEE international conference on cloud engineering: proceedings / [ed] Lisa O’Conner, IEEE Computer Society, 2023, p. 32-42, article id 24126172. Conference paper (Refereed)
    Abstract [en]

    Microservice-based architectures consist of numerous, loosely coupled services with multiple instances. Service meshes aim to simplify traffic management and prevent microservice overload through circuit breaking and request retry mechanisms. Previous studies have demonstrated that the static configuration of these mechanisms is unfit for the dynamic environment of microservices. We conduct a sensitivity analysis to understand the impact of retrying across a wide range of scenarios. Based on the findings, we propose a retry controller that can also work with dynamically configured circuit breakers. We have empirically assessed our proposed controller in various scenarios, including transient overload and noisy neighbors while enforcing adaptive circuit breaking. The results show that our proposed controller does not deviate from a well-tuned configuration while maintaining carried response time and adapting to the changes. In comparison to the default static retry configuration that is mostly used in practice, our approach improves the carried throughput up to 12x and 32x respectively in the cases of transient overload and noisy neighbors.

  • 37.
    Sing, Ranumayee
    et al.
    Faculty of Engineering (Computer Science and Engineering), BPUT, Odisha, Rourkela, India.
    Bhoi, Sourav Kumar
    Department of Computer Science and Engineering, Parala Maharaja Engineering College (Govt.), Odisha, Berhampur, India.
    Panigrahi, Niranjan
    Department of Computer Science and Engineering, Parala Maharaja Engineering College (Govt.), Odisha, Berhampur, India.
    Sahoo, Kshira Sagar
    Umeå University, Faculty of Science and Technology, Department of Computing Science. Department of Computer Science and Engineering, SRM University, Andhra Pradesh, Amaravati, India.
    Bilal, Muhammad
    Department of Computer Engineering, Hankuk University of Foreign Studies, Yongin-si, South Korea.
    Shah, Sayed Chhattan
    Department of Information and Communication Engineering, Hankuk University of Foreign Studies, Yongin-si 17035, South Korea.
    Emcs: an energy-efficient makespan cost-aware scheduling algorithm using evolutionary learning approach for cloud-fog-based IoT applications. 2022. In: Sustainability, E-ISSN 2071-1050, Vol. 14, no 22, article id 15096. Article in journal (Refereed)
    Abstract [en]

    The tremendous expansion of the Internet of Things (IoT) has generated an enormous volume of near and remote sensing data, which is increasing with the emergence of new solutions for sustainable environments. Cloud computing is typically used to help resource-constrained IoT sensing devices. However, the cloud servers are placed deep within the core network, a long way from the IoT devices, introducing immense data transactions. These transactions require heavy electricity consumption and release harmful CO2 to the environment. A distributed computing environment located at the edge of the network, named fog computing, has been promoted to reduce the limitations of cloud computing for IoT applications. Fog computing potentially processes real-time and delay-sensitive data, and it reduces the traffic, which minimizes the energy consumption. The additional energy consumption can be reduced by implementing energy-aware task scheduling, which decides on the execution of tasks at cloud or fog nodes on the basis of minimum completion time, cost, and energy consumption. In this paper, an algorithm called energy-efficient makespan cost-aware scheduling (EMCS) is proposed using an evolutionary strategy to optimize the execution time, cost, and energy consumption. The performance of this work is evaluated using extensive simulations. Results show that EMCS is 67.1% better than cost makespan-aware scheduling (CMaS), 58.79% better than Heterogeneous Earliest Finish Time (HEFT), 54.68% better than Bees Life Algorithm (BLA) and 47.81% better than Evolutionary Task Scheduling (ETS) in terms of makespan. In terms of cost, EMCS costs 62.4% less than CMaS, 26.41% less than BLA, and 6.7% less than ETS. In terms of energy consumption, EMCS consumes 11.55% less than CMaS, 4.75% less than BLA and 3.19% less than ETS. Results also show that with an increase in the number of fog and cloud nodes, the balance between cloud and fog nodes gives better performance in terms of makespan, cost, and energy consumption.

  • 38.
    Sjödin, Jonas
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Building OCI Images With a Container Orchestrator: A comparison of OCI build-tools. 2021. Independent thesis Advanced level (degree of Master (Two Years)), 300 HE credits. Student thesis
    Abstract [en]

    Cloud computing is a quickly growing field in modern computing science where new technologies arise every day. One of the latest trends in cloud computing is container-based technology, which allows applications to run in a reproducible and stateless fashion without requiring manually installed dependencies. Another trend in computer science is DevOps, a methodology where developers take part in the operations process. DevOps popularises the use of CI/CD workflows, where automatic pipelines run tests and scripts on new code. A container orchestrator, like Kubernetes, can be used to control and modify containers. Kubernetes allows integrating multiple third-party applications that can monitor performance, analyze logs, and much more. Kubernetes can be integrated into the CI/CD system to utilise its container orchestration perks. Building containers inside a container can cause security issues because of native security flaws with OCI build tools. This thesis aims to look at these issues and analyse the field of container-orchestrated OCI build tools using Kubernetes and OCI build tools. We also discover how to develop a test suite that can reliably test container-orchestrated OCI build tools and export metrics. Lastly, the thesis compares different Dockerfile-compliant build tools with the test suite to find out which has the best performance and caching. The compared build tools are BuildKit, Kaniko, Img and Buildah; overall, BuildKit and Kaniko are the fastest and most resource-effective build tools. It is not obvious which build tool is the most secure. Kaniko, which runs as a root container, requires no privileges and is therefore hard to break out of, but a breakout would give the attacker root access to the host machine. BuildKit and Img only require unconfined seccomp and AppArmor, which makes a container breakout more probable, though less so than with Buildah, which must be run in a privileged container. Since they can run rootless, the attacker will only have the same access to the host as that user in case of a container breakout.

  • 39.
    Stenman, Johan
    Umeå University, Faculty of Teacher Education, Interactive Media and Learning.
    Den interaktiva tavlan: En studie av dess användningsområde i två undersökta skolor. 2005. Independent thesis Basic level (professional degree), 10 credits / 15 HE credits. Student thesis
    Abstract [en, translated from sv]

    In this work I examine how the interactive whiteboard is used in two Swedish schools and describe how it is used in other countries, such as the United Kingdom. The aim was also to investigate how it differs from an ordinary whiteboard and which problems can arise when the interactive whiteboard is used in teaching. The study in the Swedish schools was carried out as interviews with six teachers, all of whom used the interactive whiteboard in their teaching. Contact with the teachers began with correspondence by e-mail and concluded with a telephone interview.

    The results show that the interactive whiteboard was mainly used to show images and animations as a complement to the school's traditional whiteboard, because the interactive whiteboard felt too unnatural to write on. All teachers in the study nevertheless emphasised that the interactive whiteboard had been well received by the pupils; the interactivity and the use of images and sound that it brought to teaching were seen as its greatest strength, together with the ability to stand in front of the class and control the computer. Half of the teachers pointed out that the difficulties they had encountered with the interactive whiteboard could easily be traced to their own lack of technical knowledge. They therefore consider a training course to broaden knowledge of the board's functions to be necessary, as well as access to technical support in case the equipment runs into problems. I consider this to be the greatest limitation of the interactive whiteboard: any problems that arise must be resolvable so that the board can be used without hindrance in teaching.

  • 40.
    Strandberg, Joakim
    Umeå University, Faculty of Science and Technology, Department of Applied Physics and Electronics.
    Intelligent valutarobot. 2020. Independent thesis Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis
    Abstract [en, translated from sv]

    In today's society the flow of information has increased dramatically in just the last few years. We see this in social media and in a trend towards being constantly connected to the internet and to various services that bombard us with new information. This digital reality makes greater demands, requiring more and more interaction from us as users and systems that are kept up to date. Development has also gathered pace in artificial intelligence and machine learning, which are used so that we as users receive the information we are expected to want. Umeå University frequently trades with other countries in various currencies. On the day that invoices in foreign currencies are paid, the bank sets the exchange rate for each currency. The difference between the bank's exchange rates and those in the financial system must be rebooked manually, which creates a need to keep the exchange rates up to date. Current daily exchange rates can be retrieved through a large number of application programming interfaces (APIs) around the world.

    This thesis describes the development of an application in the Python programming language that can fetch exchange rates at a given frequency from a number of different APIs, depending on the user's choice. The application performs a plausibility check of the exchange rates before they are loaded into the financial system, to ensure correct and up-to-date data. The whole process is automatic once the user has chosen an API in the application. A further possibility is to use historical data to predict the exchange rate at a future point in time. In this thesis I have chosen to use free versions, but it is of course possible to obtain more or less real-time data by choosing the paid versions and thereby create even better material for analysis and more exact predictions of future exchange rates.
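
    As an illustration of the plausibility check mentioned in the abstract above, the following is a minimal sketch and not the thesis's implementation. The function name `plausible_rates`, the 10% tolerance and the rule of comparing against the last accepted rate are all assumptions made for this example.

```python
from typing import Dict


def plausible_rates(new_rates: Dict[str, float],
                    previous_rates: Dict[str, float],
                    max_relative_change: float = 0.10) -> Dict[str, float]:
    """Return only the rates that pass a simple sanity check.

    Hypothetical rule: a rate is accepted if it is positive and deviates
    from the last accepted rate by at most max_relative_change.
    """
    accepted: Dict[str, float] = {}
    for currency, rate in new_rates.items():
        if rate <= 0:
            continue  # a non-positive exchange rate is never valid
        previous = previous_rates.get(currency)
        if previous is None:
            continue  # no reference value: leave for manual review
        if abs(rate - previous) / previous <= max_relative_change:
            accepted[currency] = rate
    return accepted
```

    Rates that fail the check are simply withheld here; a real system would more likely log them for manual review before anything is loaded into the financial system.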

  • 41.
    Tiwari, Devisha
    et al.
    Department of Computer Science and Engineering, National Institute of Technology, Ashok Rajpath, Bihar, Patna, India.
    Mondal, Bhaskar
    Department of Computer Science and Engineering, National Institute of Technology, Ashok Rajpath, Bihar, Patna, India.
    Singh, Anil
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Fast encryption scheme for secure transmission of e-healthcare images. 2023. In: International Journal of Image, Graphics and Signal Processing, ISSN 2074-9074, E-ISSN 2074-9082, Vol. 15, no 5, p. 88-99. Article in journal (Refereed)
    Abstract [en]

    E-healthcare systems (EHSD), medical communications, digital imaging (DICOM) things have gained popularity over the past decade as they have become the top contenders for interoperability and adoption as a global standard for transmitting and communicating medical data. Security is a growing issue as EHSD and DICOM have grown more usable on any-to-any devices. The goal of this research is to create a privacy-preserving encryption technique for rapid EHSD communication with minimal storage. A new 2D logistic-sine chaotic map (2DLSCM) is used to design the proposed encryption method, which has been developed specifically for peer-to-peer communications via unique keys. Through the 3D Lorenz map, which feeds the initial values to it, the 2DLSCM is able to provide a unique keyspace of 2^544 bits in each round of peer-to-peer paired transmission. A permutation-diffusion design is used in the encryption process, and the 2DLSCM together with the 3D Lorenz system is used to generate unique initial values for the keys. Without interfering with real-time medical transmission, the approach can quickly encrypt any EHSD image and DICOM object. To assess the method, five distinct EHSD images of different kinds, sizes, and quality are selected. The findings indicate strong protection, speed, and scalability when compared to existing similar methods in the literature.

  • 42.
    Tärneberg, William
    et al.
    Mehta, Amardeep
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Wadbro, Eddie
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Tordsson, Johan
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Eker, Johan
    Kihl, Maria
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Dynamic application placement in the Mobile Cloud Network. 2017. In: Future generations computer systems, ISSN 0167-739X, E-ISSN 1872-7115, Vol. 70, p. 163-177. Article in journal (Refereed)
    Abstract [en]

    To meet the challenges of consistent performance, low communication latency, and a high degree of user mobility, cloud and Telecom infrastructure vendors and operators foresee a Mobile Cloud Network that incorporates public cloud infrastructures with cloud augmented Telecom nodes in forthcoming mobile access networks. A Mobile Cloud Network is composed of distributed cost- and capacity-heterogeneous resources that host applications that in turn are subject to a spatially and quantitatively rapidly changing demand. Such an infrastructure requires a holistic management approach that ensures that the resident applications’ performance requirements are met while sustainably supported by the underlying infrastructure. The contribution of this paper is three-fold. Firstly, this paper contributes a model that captures the cost- and capacity-heterogeneity of a Mobile Cloud Network infrastructure. The model bridges the Mobile Edge Computing and Distributed Cloud paradigms by modelling multiple tiers of resources across the network and serves not just mobile devices but any client beyond and within the network. A set of resource management challenges is presented based on this model. Secondly, an algorithm that holistically and optimally solves these challenges is proposed. The algorithm is formulated as an application placement method that incorporates aspects of network link capacity, desired user latency and user mobility, as well as data centre resource utilisation and server provisioning costs. Thirdly, to address scalability, a tractable locally optimal algorithm is presented. The evaluation demonstrates that the placement algorithm significantly improves latency and resource utilisation skewness while minimising the operational cost of the system. Additionally, the proposed model and evaluation method demonstrate the viability of dynamic resource management of the Mobile Cloud Network and the need for accommodating rapidly mobile demand in a holistic manner.

  • 43.
    Vreča, Jure
    et al.
    Sturm, Karl J. X.
    Gungl, Ernest
    Merchant, Farhad
    Bientinesi, Paolo
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Leupers, Rainer
    Brezočnik, Zmago
    Accelerating Deep Learning Inference in Constrained Embedded Devices Using Hardware Loops and a Dot Product Unit. 2020. In: IEEE Access, E-ISSN 2169-3536, Vol. 8, p. 165913-165926. Article in journal (Refereed)
    Abstract [en]

    Deep learning algorithms have seen success in a wide variety of applications, such as machine translation, image and speech recognition, and self-driving cars. However, these algorithms have only recently gained a foothold in the embedded systems domain. Most embedded systems are based on cheap microcontrollers with limited memory capacity, and, thus, are typically seen as not capable of running deep learning algorithms. Nevertheless, we consider that advancements in compression of neural networks and neural network architecture, coupled with an optimized instruction set architecture, could make microcontroller-grade processors suitable for specific low-intensity deep learning applications. We propose a simple instruction set extension with two main components: hardware loops and dot product instructions. To evaluate the effectiveness of the extension, we developed optimized assembly functions for the fully connected and convolutional neural network layers. When using the extensions and the optimized assembly functions, we achieve an average clock cycle count decrease of 73% for a small-scale convolutional neural network. On a per-layer basis, our optimizations decrease the clock cycle count for fully connected layers and convolutional layers by 72% and 78%, respectively. The average energy consumption per inference decreases by 73%. We have shown that adding just hardware loops and dot product instructions has a significant positive effect on processor efficiency in computing neural network functions.
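
    To illustrate why a dot product instruction and hardware loops target the hot path of a fully connected layer, here is a minimal Python sketch. It is not the paper's assembly code; the function names and the ReLU activation are assumptions made for this example. The inner multiply-accumulate loop in `dot` is the kind of work a dot product unit would take over, and the fixed-count loops are the kind that hardware loops could run without branch overhead.

```python
from typing import List


def dot(a: List[float], b: List[float]) -> float:
    """Multiply-accumulate over two equal-length vectors."""
    acc = 0.0
    for x, y in zip(a, b):
        acc += x * y
    return acc


def fully_connected(weights: List[List[float]],
                    bias: List[float],
                    inputs: List[float]) -> List[float]:
    """One dot product per output neuron, followed by ReLU (assumed activation)."""
    return [max(0.0, dot(row, inputs) + b) for row, b in zip(weights, bias)]
```

    For a layer with N inputs and M outputs this performs M dot products of length N, so nearly all of the work sits in the multiply-accumulate loop.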

  • 44.
    Wikstrand, Greger
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Nilsson, Thomas
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Dougherty, Mark S
    Prioritized repeated eliminations multiple access: a novel protocol for wireless networks. Manuscript (preprint) (Other academic)
  • 45.
    Zhang, Cheng
    et al.
    College of Computer Science and Electronic Engineering, Hunan University, Changsha, China; Key Laboratory of Blockchain and Cyberspace Governance of Zhejiang Province.
    Xu, Yang
    College of Computer Science and Electronic Engineering, Hunan University, Changsha, China; Key Laboratory of Blockchain and Cyberspace Governance of Zhejiang Province.
    Elahi, Haroon
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Zhang, Deyu
    School of Computer Science and Engineering, Central South University, Changsha, China.
    Tan, Yunlin
    College of Computer Science and Electronic Engineering, Hunan University, Changsha, China.
    Chen, Junxian
    School of Electronic Information, Hunan University, Changsha, China.
    Zhang, Yaoxue
    College of Computer Science and Electronic Engineering, Hunan University, Changsha, China; Department of Computer Science and Technology, Tsinghua University, Beijing, China.
    A Blockchain-based Model Migration Approach for Secure and Sustainable Federated Learning in IoT Systems. 2023. In: IEEE Internet of Things Journal, ISSN 2327-4662, Vol. 10, no 8, p. 6574-6585. Article in journal (Refereed)
    Abstract [en]

    Model migration can accelerate model convergence during federated learning on the Internet of Things (IoT) devices and reduce training costs by transferring feature extractors from fast to slow devices, which, in turn, enables sustainable computing. However, malicious or lazy devices may migrate fake models or resist sharing models for their own benefit, reducing the desired efficiency and reliability of a federated learning system. To this end, this work presents a blockchain-based model migration approach for resource-constrained IoT systems. The proposed approach aims to achieve secure model migration and speed up model training while minimizing computation cost. We first develop an incentive mechanism considering the economic benefits of fast devices, which breaks the Nash equilibrium established by lazy devices and encourages capable devices to train and share models. Second, we design a clustering-based algorithm for identifying malicious devices and preventing them from defrauding incentives. Third, we use blockchain to ensure trustworthiness in model migration and incentive processes. Blockchain records the interaction between the central server and IoT devices and runs the incentive algorithm without exposing the devices’ private data. Theoretical analysis and experimental results show that the proposed approach can accelerate federated learning rates, reduce model training computation costs to increase sustainability, and resist malicious attacks.

  • 46.
    Zhang, Yi-Wen
    et al.
    College of Computer Science and Technology, Huaqiao University, China.
    Chen, Rong-Kun
    College of Computer Science and Technology, Huaqiao University, China.
    Gu, Zonghua
    Umeå University, Faculty of Science and Technology, Department of Applied Physics and Electronics.
    Energy-Aware Partitioned Scheduling of Imprecise Mixed-Criticality Systems. 2023. In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, ISSN 0278-0070, E-ISSN 1937-4151, p. 1-1. Article in journal (Refereed)
    Abstract [en]

    We consider partitioned scheduling of an Imprecise Mixed-Criticality (IMC) taskset on a uniform multiprocessor platform, with Earliest Deadline First-Virtual Deadline (EDF-VD) as the uniprocessor task scheduling algorithm, and address the optimization problem of finding a feasible task-to-processor assignment and low-criticality (LO) mode processor speed with the objective of minimizing the system’s average energy consumption in LO mode. We propose a task-to-processor assignment algorithm, Criticality-Unaware Worst-Fit Decreasing (CU-WFD), which allocates tasks with the Worst-Fit Decreasing (WFD) heuristic method based on utilization values at their respective criticality levels. We determine the energy-efficient speed for each processor based on EDF-VD scheduling, and present our algorithm Energy-Efficient Partitioned Scheduling for Imprecise Mixed-Criticality (EEPSIMC) with the CU-WFD heuristic algorithm to minimize system energy consumption. The experimental results show that our proposed algorithm has good performance in terms of both schedulability ratio and normalized energy consumption compared to seven comparison baselines.
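
    The Worst-Fit Decreasing heuristic named in the abstract above can be sketched generically as follows. This is the textbook bin-packing heuristic, not the paper's CU-WFD algorithm: each task is assumed to be summarised by a single utilization value, and the per-processor `capacity` bound of 1.0 is a placeholder rather than the EDF-VD schedulability test.

```python
from typing import List, Optional


def worst_fit_decreasing(utilizations: List[float],
                         num_processors: int,
                         capacity: float = 1.0) -> Optional[List[List[int]]]:
    """Assign task indices to processors; return None if some task does not fit."""
    loads = [0.0] * num_processors
    assignment: List[List[int]] = [[] for _ in range(num_processors)]
    # Consider tasks in order of decreasing utilization.
    order = sorted(range(len(utilizations)),
                   key=lambda i: utilizations[i], reverse=True)
    for i in order:
        # Worst fit: choose the processor with the most remaining capacity,
        # i.e. the currently least-loaded one (ties go to the lowest index).
        target = min(range(num_processors), key=lambda p: loads[p])
        if loads[target] + utilizations[i] > capacity + 1e-12:
            return None
        loads[target] += utilizations[i]
        assignment[target].append(i)
    return assignment
```

    Worst fit tends to spread load evenly across processors, which is typically what makes it attractive when a lower common processor speed saves energy.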

  • 47.
    Zhang, Yi-Wen
    et al.
    College of Computer Science and Technology, Huaqiao University, China.
    Ma, Jin-Peng
    College of Computer Science and Technology, Huaqiao University, China.
    Zheng, Hui
    College of Computer Science and Technology, Huaqiao University, China.
    Gu, Zonghua
    Umeå University, Faculty of Science and Technology, Department of Applied Physics and Electronics.
    Criticality-aware EDF scheduling for constrained-deadline imprecise mixed-criticality systems. 2024. In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, ISSN 0278-0070, E-ISSN 1937-4151, Vol. 43, no 2, p. 480-491. Article in journal (Refereed)
    Abstract [en]

    EDF-VD, an effective dynamic-priority scheduling algorithm for mixed-criticality systems, was first aimed at the classic mixed-criticality task model, in which all low-criticality (LO) tasks are abandoned in the high-criticality mode. However, it has low schedulability for the imprecise mixed-criticality (IMC) task model with constrained deadlines, in which LO tasks are provided graceful degradation services instead of being abandoned. In this paper, we study how to improve schedulability for the IMC task model. First, we propose a novel criticality-aware EDF scheduling algorithm (CA-EDF) that tries to delay the LO task execution to improve schedulability. Second, we derive sufficient conditions of schedulability for CA-EDF based on the Demand Bound Function. Finally, we evaluate CA-EDF through extensive simulation. The experimental results indicate that CA-EDF can improve the schedulability ratio by about 13.10% compared to the existing algorithms.
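
    For readers unfamiliar with the Demand Bound Function mentioned above, the sketch below shows the classical single-criticality form for sporadic tasks with constrained deadlines under EDF on one processor. It is not the CA-EDF condition derived in the paper. Integer task parameters, the `Task` tuple layout and the caller-supplied `horizon` are assumptions made for this example.

```python
from typing import List, Tuple

# (worst-case execution time C, relative deadline D, period T), with D <= T
Task = Tuple[int, int, int]


def demand_bound(tasks: List[Task], t: int) -> int:
    """Maximum total execution demand of jobs that are both released and
    due within any time window of length t."""
    total = 0
    for c, d, p in tasks:
        if t >= d:
            total += ((t - d) // p + 1) * c
    return total


def edf_schedulable(tasks: List[Task], horizon: int) -> bool:
    """Check dbf(t) <= t at every integer t up to a caller-chosen horizon."""
    return all(demand_bound(tasks, t) <= t for t in range(1, horizon + 1))
```

    The check is only as strong as the chosen horizon; in practice a bound such as the hyperperiod of the task set is used so that no violating window is missed.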

  • 48.
    Zhang, Yi-Wen
    et al.
    College of Computer Science and Technology, Huaqiao University, China.
    Zheng, Hui
    College of Computer Science and Technology, Huaqiao University, China.
    Gu, Zonghua
    Umeå University, Faculty of Science and Technology, Department of Applied Physics and Electronics.
    EDF-based energy-efficient semi-clairvoyant scheduling with graceful degradation. 2024. In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, ISSN 0278-0070, E-ISSN 1937-4151, Vol. 43, no 2, p. 468-479. Article in journal (Refereed)
    Abstract [en]

    Recent works introduce a semi-clairvoyant model, in which the system mode transition is revealed on the arrival of high-criticality jobs. To resolve the inconsistency between the correctness criterion for mixed-criticality systems (MCS) with the semi-clairvoyant model and the actual situation, we study the problem of schedulability and energy in MCS with the semi-clairvoyant model in this paper. First, we propose a new correctness criterion for MCS with the semi-clairvoyant model and graceful degradation, and develop a schedulability test based on Demand Bound Function methods, denoted as SCS-GD. Second, we propose an energy-efficient semi-clairvoyant scheduling algorithm based on SCS-GD, denoted as EE-SCS-GD. Finally, we conduct an experimental evaluation of SCS-GD and EE-SCS-GD using synthetically generated task sets. The experimental results show that SCS-GD can improve the schedulability ratio by 5.98% compared to existing algorithms, while EE-SCS-GD can save 56.17% energy compared to SCS-GD.

  • 49.
    Zhao, Qingling
    The PCA Lab, School of Computer Science and Engineering, Nanjing University of Science and Technology, Systems for High-Dimensional Information of Ministry of Education, Jiangsu Key Lab of Image and Video Understanding for Social Security, Jiangsu, Nanjing, China.
    Chen, Mingqiang
    The PCA Lab, School of Computer Science and Engineering, Nanjing University of Science and Technology, Systems for High-Dimensional Information of Ministry of Education, Jiangsu Key Lab of Image and Video Understanding for Social Security, Jiangsu, Nanjing, China.
    Gu, Zonghua
    Umeå University, Faculty of Science and Technology, Department of Applied Physics and Electronics.
    Luan, Siyu
    Umeå University, Faculty of Science and Technology, Department of Applied Physics and Electronics.
    Zeng, Haibo
    Department of Electrical and Computer Engineering, Virginia Tech, VA, Blacksburg, United States.
    Chakraborty, Samarjit
    Department of Computer Science, University of North Carolina, NC, Chapel Hill, United States.
    CAN bus intrusion detection based on auxiliary classifier GAN and out-of-distribution detection. 2022. In: ACM Transactions on Embedded Computing Systems, ISSN 1539-9087, E-ISSN 1558-3465, Vol. 21, no 4, article id 45. Article in journal (Refereed)
    Abstract [en]

    The Controller Area Network (CAN) is a ubiquitous bus protocol present in the Electrical/Electronic (E/E) systems of almost all vehicles. It is vulnerable to a range of attacks once the attacker gains access to the bus through the vehicle's attack surface. We address the problem of Intrusion Detection on the CAN bus and present a series of methods based on two classifiers trained with an Auxiliary Classifier Generative Adversarial Network (ACGAN). These methods detect and assign fine-grained labels to Known Attacks, and also detect the Unknown Attack class, in a dataset containing a mixture of Normal, Known Attack, and Unknown Attack messages. The most effective method is a cascaded two-stage classification architecture: the multi-class Auxiliary Classifier in the first stage classifies Normal and Known Attacks and passes Out-of-Distribution (OOD) samples to the binary Real-Fake Classifier in the second stage, which detects the Unknown Attack class. Performance evaluation demonstrates that our method achieves both high classification accuracy and low runtime overhead, making it suitable for deployment in the resource-constrained in-vehicle environment.
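
    The cascaded two-stage architecture described above can be sketched roughly as follows. This is a hypothetical illustration, not the authors' implementation. The OOD rule (a threshold on the top class probability), the fallback to the first-stage label when the second stage accepts a sample, and all names (`cascade_classify`, `stage1_probs`, `stage2_is_real`, `ood_threshold`) are assumptions that the abstract does not specify.

```python
from typing import Callable, Sequence

UNKNOWN_ATTACK = "Unknown Attack"


def cascade_classify(
    message: Sequence[float],
    stage1_probs: Callable[[Sequence[float]], Sequence[float]],
    stage2_is_real: Callable[[Sequence[float]], bool],
    labels: Sequence[str],
    ood_threshold: float = 0.9,  # assumed OOD rule: low top-class probability
) -> str:
    """Stage 1 labels confident samples; low-confidence (OOD) samples go to stage 2."""
    probs = stage1_probs(message)
    best = max(range(len(labels)), key=lambda i: probs[i])
    if probs[best] >= ood_threshold:
        return labels[best]  # in-distribution: Normal or a Known Attack label
    # Assumed fallback: keep the stage-1 label if stage 2 accepts the sample as "real".
    return labels[best] if stage2_is_real(message) else UNKNOWN_ATTACK


# Stub classifiers standing in for trained models (illustrative only).
labels = ["Normal", "DoS", "Fuzzy"]
confident = lambda m: [0.95, 0.03, 0.02]
uncertain = lambda m: [0.40, 0.35, 0.25]

print(cascade_classify([0.0], confident, lambda m: True, labels))
print(cascade_classify([0.0], uncertain, lambda m: False, labels))
```

    Because the second stage runs only on samples the first stage flags as OOD, most messages in such a cascade would need a single classifier pass.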
