umu.se Publications
  • 1.
    Ali-Eldin, Ahmed
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Kihl, Maria
    Lund University.
    Tordsson, Johan
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Analysis and characterization of a Video-on-Demand service workload. 2015. In: Proceedings of the 6th ACM Multimedia Systems Conference, MMSys 2015, ACM Digital Library, 2015, p. 189-200. Conference paper (Refereed)
    Abstract [en]

    Video-on-Demand (VoD) and video sharing services account for a large percentage of the total downstream Internet traffic. In order to provide a better understanding of the load on these services, we analyze and model a workload trace from a VoD service provided by a major Swedish TV broadcaster. The trace contains over half a million requests generated by more than 20000 unique users. Among other things, we study the request arrival rate, the inter-arrival time, the spikes in the workload, the video popularity distribution, the streaming bit-rate distribution and the video duration distribution. Our results show that the user and the session arrival rates for the TV4 workload do not follow a Poisson process. The arrival rate distribution is modeled using a log-normal distribution while the inter-arrival time distribution is modeled using a stretched exponential distribution. We observe the “impatient user” behavior where users abandon streaming sessions after minutes or even seconds of starting them. Both very popular videos and non-popular videos are particularly affected by impatient users. We investigate if this behavior is an invariant for VoD workloads.

  • 2.
    Ali-Eldin, Ahmed
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Kihl, Maria
    Dept. of Electrical and Information Technology, Lund University, Lund, Sweden.
    Tordsson, Johan
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Efficient provisioning of bursty scientific workloads on the cloud using adaptive elasticity control. 2012. In: Proceedings of the 3rd workshop on Scientific Cloud Computing Date, Association for Computing Machinery (ACM), 2012, p. 31-40. Conference paper (Refereed)
    Abstract [en]

    Elasticity is the ability of a cloud infrastructure to dynamically change the amount of resources allocated to a running service as load changes. We build an autonomous elasticity controller that changes the number of virtual machines allocated to a service based on both monitored load changes and predictions of future load. The cloud infrastructure is modeled as a G/G/N queue. This model is used to construct a hybrid reactive-adaptive controller that quickly reacts to sudden load changes, prevents premature release of resources, takes into account the heterogeneity of the workload, and avoids oscillations. Using simulations with Web and cluster workload traces, we show that our proposed controller lowers the number of delayed requests by a factor of 70 for the Web traces and 3 for the cluster traces when compared to a reactive controller. Our controller also decreases the average number of queued requests by a factor of 3 for both traces, and reduces oscillations by a factor of 7 for the Web traces and 3 for the cluster traces. This comes at the expense of between 20% and 30% over-provisioning, as compared to a few percent for the reactive controller.

  • 3.
    Ali-Eldin, Ahmed
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Rezaie, Ali
    Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics.
    Mehta, Amardeep
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Razroev, Stanislav
    Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics.
    Sjöstedt-de Luna, Sara
    Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics.
    Seleznjev, Oleg
    Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics.
    Tordsson, Johan
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    How will your workload look like in 6 years?: Analyzing Wikimedia's workload. 2014. In: Proceedings of the 2014 IEEE International Conference on Cloud Engineering (IC2E 2014) / [ed] Lisa O’Conner, IEEE Computer Society, 2014, p. 349-354. Conference paper (Refereed)
    Abstract [en]

    Accurate understanding of workloads is key to efficient cloud resource management as well as to the design of large-scale applications. We analyze and model the workload of Wikipedia, one of the world's largest web sites. With descriptive statistics, time-series analysis, and polynomial splines, we study the trend and seasonality of the workload, its evolution over the years, and also investigate patterns in page popularity. Our results indicate that the workload is highly predictable with a strong seasonality. Our short term prediction algorithm is able to predict the workload with a Mean Absolute Percentage Error of around 2%.

  • 4.
    Ali-Eldin, Ahmed
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Seleznjev, Oleg
    Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics.
    Sjöstedt-de Luna, Sara
    Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics.
    Tordsson, Johan
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Measuring cloud workload burstiness. 2014. In: 2014 IEEE/ACM 7th International Conference on Utility and Cloud Computing (UCC), IEEE conference proceedings, 2014, p. 566-572. Conference paper (Refereed)
    Abstract [en]

    Workload burstiness and spikes are among the main reasons for service disruptions and decrease in the Quality-of-Service (QoS) of online services. They are hurdles that complicate autonomic resource management of datacenters. In this paper, we review the state-of-the-art in online identification of workload spikes and quantifying burstiness. The applicability of some of the proposed techniques is examined for Cloud systems where various workloads are co-hosted on the same platform. We discuss Sample Entropy (SampEn), a measure used in biomedical signal analysis, as a potential measure for burstiness. A modification to the original measure is introduced to make it more suitable for Cloud workloads.

  • 5.
    Ali-Eldin, Ahmed
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Tordsson, Johan
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    An adaptive hybrid elasticity controller for cloud infrastructures. 2012. In: 2012 IEEE Network operations and management symposium (NOMS), IEEE Communications Society, 2012, p. 204-212. Conference paper (Refereed)
    Abstract [en]

    Cloud elasticity is the ability of the cloud infrastructure to rapidly change the amount of resources allocated to a service in order to meet the actual varying demands on the service while enforcing SLAs. In this paper, we focus on horizontal elasticity, the ability of the infrastructure to add or remove virtual machines allocated to a service deployed in the cloud. We model a cloud service using queuing theory. Using that model we build two adaptive proactive controllers that estimate the future load on a service. We explore the different possible scenarios for deploying a proactive elasticity controller coupled with a reactive elasticity controller in the cloud. Using simulation with workload traces from the FIFA world-cup web servers, we show that a hybrid controller that incorporates a reactive controller for scale up coupled with our proactive controllers for scale down decisions reduces SLA violations by a factor of 2 to 10 compared to a regression based controller or a completely reactive controller.

  • 6.
    Ali-Eldin, Ahmed
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Tordsson, Johan
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Kihl, Maria
    Lund University.
    WAC: A Workload analysis and classification tool for automatic selection of cloud auto-scaling methods. Manuscript (preprint) (Other academic)
    Abstract [en]

    Autoscaling algorithms for elastic cloud infrastructures dynamically change the amount of resources allocated to a service according to the current and predicted future load. Since there are no perfect predictors, no single elasticity algorithm is suitable for accurate predictions of all workloads. To improve the quality of workload predictions and increase the Quality-of-Service (QoS) guarantees of a cloud service, multiple autoscalers suitable for different workload classes need to be used. In this work, we introduce WAC, a Workload Analysis and Classification tool that assigns workloads to the most suitable elasticity autoscaler out of a set of pre-deployed autoscalers. The workload assignment is based on the workload characteristics and a set of user-defined Business-Level-Objectives (BLO). We describe the tool design and its main components. We implement WAC and evaluate its precision using various workloads, BLO combinations and state-of-the-art autoscalers. Our experiments show that, when the classifier is tuned carefully, WAC assigns between 87% and 98.3% of the workloads to the most suitable elasticity autoscaler.

  • 7.
    Ali-Eldin, Ahmed
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Tordsson, Johan
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Kihl, Maria
    Department of Electrical and Information Technology, Lund University, Lund, Sweden.
    Workload Classification for Efficient Auto-Scaling of Cloud Resources. 2013. Manuscript (preprint) (Other academic)
    Abstract [en]

    Elasticity algorithms for cloud infrastructures dynamically change the amount of resources allocated to a running service according to the current and predicted future load. Since there is no perfect predictor, and since different applications’ workloads have different characteristics, no single elasticity algorithm is suitable for future predictions for all workloads. In this work, we introduce WAC, a Workload Analysis and Classification tool that analyzes workloads and assigns them to the most suitable elasticity controllers based on the workloads’ characteristics and a set of business level objectives.

    WAC has two main components, the analyzer and the classifier. The analyzer analyzes workloads to extract some of the features used by the classifier, namely, workloads’ autocorrelations and sample entropies which measure the periodicity and the burstiness of the workloads respectively. These two features are used with the business level objectives by the classifier as the features used to assign workloads to elasticity controllers. We start by analyzing 14 real workloads available from different applications. In addition, a set of 55 workloads is generated to test WAC on more workload configurations. We implement four state of the art elasticity algorithms. The controllers are the classes to which the classifier assigns workloads. We use a K nearest neighbors classifier and experiment with different workload combinations as training and test sets. Our experiments show that, when the classifier is tuned carefully, WAC correctly classifies between 92% and 98.3% of the workloads to the most suitable elasticity controller.

  • 8.
    Armstrong, Django
    et al.
    University of Leeds.
    Espling, Daniel
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Tordsson, Johan
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Djemame, Karim
    University of Leeds.
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Contextualization: dynamic configuration of virtual machines. 2015. In: Journal of Cloud Computing - Advances, Systems and Applications, ISSN 2192-113X, Vol. 4, no 17. Article in journal (Refereed)
    Abstract [en]

    New VM instances are created from static templates that contain the basic configuration of the VM to achieve elasticity with regards to capacity. Instance specific settings can be injected into the VM during the deployment phase through means of contextualization. So far this is limited to a single data source and data remains static throughout the lifecycle of the VM.

    We present a layered approach to contextualization that supports different classes of contextualization data available from several sources. The settings are made available to the VM through virtual devices. Inside each VM data from different classes are layered on top of each other to create a unified file hierarchy.

    Context data can be modified during runtime by updating the contents of the virtual devices, making our approach the first contextualization approach to natively support recontextualization. Recontextualization enables runtime reconfiguration of an executing service and can act as a trigger and key enabler of self-* techniques. This trigger provides a service with a mechanism to adapt or optimize itself in response to a changing environment. The runtime reconfiguration using recontextualization and its potential gains are illustrated in an example with a distributed file system, demonstrating the feasibility of our approach.

  • 9.
    Armstrong, Django
    et al.
    University of Leeds.
    Espling, Daniel
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Tordsson, Johan
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Djemame, Karim
    University of Leeds.
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Runtime virtual machine recontextualization for clouds. 2013. In: Euro-Par 2012: Parallel Processing Workshops / [ed] Ioannis Caragiannis et al., Springer Berlin/Heidelberg, 2013, Vol. 7640, p. 567-576. Conference paper (Refereed)
    Abstract [en]

    We introduce and define the concept of recontextualization for cloud applications by extending contextualization, i.e. the dynamic configuration of virtual machines (VM) upon initialization, with autonomous updates during runtime. Recontextualization allows VM images and instances to be dynamically re-configured without restarts or downtime, and the concept is applicable to all aspects of configuring a VM from virtual hardware to multi-tier software stacks. Moreover, we propose a runtime cloud recontextualization mechanism based on virtual device management that enables recontextualization without the need to customize the guest VM. We illustrate our concept and validate our mechanism via a use case demonstration: the reconfiguration of a cross-cloud migratable monitoring service in a dynamic cloud environment. We discuss the details of the interoperable recontextualization mechanism, its architecture and demonstrate a proof of concept implementation. A performance evaluation illustrates the feasibility of the approach and shows that the recontextualization mechanism performs adequately with an overhead of 18% of the total migration time.

  • 10.
    Bayuh Lakew, Ewnetu
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Xu, Lei
    Hernandez-Rodriguez, Francisco
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Pahl, Claus
    A Tree-based Protocol for Enforcing Quotas in Clouds. 2014. In: 2014 IEEE WORLD CONGRESS ON SERVICES (SERVICES), 2014, p. 279-286. Conference paper (Refereed)
    Abstract [en]

    Services are increasingly being hosted on cloud nodes to enhance their performance and increase their availability. The virtually unlimited availability of cloud resources enables service owners to consume resources without quantitative restrictions, paying only for what they use. To avoid cost overruns, resource consumption must be controlled and capped when necessary. We present a distributed tree-based protocol for managing quotas in clouds that minimizes communication overheads and reduces the time required to determine whether a quota has been exhausted. Experimental evaluation shows that our protocol reduces communication costs by 42% relative to a distributed baseline solution and is up to 15 times faster.

  • 11. Beco, S
    et al.
    Maraschini, A
    Pacini, F
    Biran, O
    Breitgand, O
    Meth, K
    Rochwerger, B
    Salant, E
    Silvera, E
    Tal, S
    Wolfsthal, Y
    Yehuda, M
    Caceres, J
    Hierro, J
    Emmerich, W
    Galis, A
    Edblom, Lennart
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Henriksson, Daniel
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Hernandez, Francisco
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Tordsson, Johan
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Hohl, A
    Levy, E
    Sampaio, A
    Scheuermann, B
    Wusthoff, M
    Latanicki, J
    Lopez, G
    Marin-Frisonroche, J
    Dorr, A
    Ferstl, F
    Huedo, E
    Llorente, I
    Montero, R
    Massonet, P
    Naqvi, S
    Dallons, G
    Pezz, M
    Puliafito, A
    Ragusa, C
    Scarpa, M
    Muscella, S
    Cloud Computing and RESERVOIR project. 2009. In: Nuovo Cimento C, ISSN 1124-1896, Vol. 32, no 2, p. 99-103. Article in journal (Refereed)
  • 12. Ben Yehuda, M.
    et al.
    Biran, O.
    Breitgand, D.
    Meth, K.
    Rochwerger, B.
    Salant, E.
    Silvera, E.
    Tal, S.
    Wolfsthal, Y.
    Cáceres, J.
    Hierro, J.
    Emmerich, W.
    Galis, A.
    Edblom, Lennart
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Henriksson, Daniel
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Hernández, Francisco
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Tordsson, Johan
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Hohl, A.
    Levy, E.
    Sampaio, A.
    Scheuermann, B.
    Wusthoff, M.
    Latanicki, J.
    Lopez, G.
    Marin-Frisonroche, J.
    Dörr, A.
    Ferstl, F.
    Beco, S.
    Pacini, F.
    Llorente, I.
    Montero, R.
    Huedo, E.
    Massonet, P.
    Naqvi, S.
    Dallons, G.
    Pezzé, M.
    Puliato, A.
    Ragusa, C.
    Scarpa, M.
    Muscella, S.
    RESERVOIR: An ICT Infrastructure for Reliable and Effective Delivery of Services as Utilities. 2008. Report (Other academic)
  • 13.
    Berglund, Ann-Charlotte
    et al.
    Linnaeus Centre for Bioinformatics, Uppsala Universitet.
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Hernández, Francisco
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Sandman, Björn
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Tordsson, Johan
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Combining local and grid resources in scientific workflows (for Bioinformatics). 2009. Conference paper (Refereed)
    Abstract [en]

    We examine some issues that arise when using both local and Grid resources in scientific workflows. Our previous work addresses and illustrates the benefits of a light-weight and generic workflow engine that manages and optimizes Grid resource usage. Extending this effort, we here illustrate how a client tool for bioinformatics applications employs the engine to interface with Grid resources. We also explore how to define data flows that transparently integrate local and Grid subworkflows. In addition, the benefits of parameter sweep workflows are examined and a means for describing this type of workflow in an abstract and concise manner is introduced. Finally, the above mechanisms are employed to perform an orthology detection analysis.

  • 14.
    Dackland, Krister
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Design and performance modeling of parallel block matrix factorizations for distributed memory multicomputers. 1992. In: Proceedings of the Industrial Mathematics Week, 1992, p. 102-116. Conference paper (Refereed)
    Abstract [en]

    Efficient and scalable parallel block algorithms for the LU factorization with partial pivoting, the Cholesky, and QR factorizations in a distributed memory multicomputer environment are presented. The distributed system is viewed as a ring of processors and the algorithms correspond to shared memory algorithms parallelized on block level (explicit parallelism). Performance of the algorithms is analyzed theoretically and illustrated empirically by implementations on the Intel iPSC/2 hypercube. A model predicting performance and optimal block size is presented.

  • 15.
    Dackland, Krister
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Parallel block matrix factorizations for distributed memory multicomputers. 1992. Report (Other academic)
  • 16.
    Dackland, Krister
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Ring-oriented block matrix factorization algorithms for shared and distributed memory architectures. 1992. Report (Other academic)
  • 17.
    Dackland, Krister
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Kågström, Bo
    Umeå University, Faculty of Science and Technology, Department of Computing Science. Umeå University, Faculty of Science and Technology, High Performance Computing Center North (HPC2N).
    A ring-oriented approach for block matrix factorizations on shared and distributed memory architectures. 1993. In: Proceedings of the Sixth SIAM Conference on Parallel Processing for Scientific Computing / [ed] R. F. Sincovec et al., Norfolk: SIAM Publications, 1993, p. 330-338. Conference paper (Refereed)
    Abstract [en]

    A block (column) wrap-mapping approach for design of parallel block matrix factorization algorithms that are (trans)portable over and between shared memory multiprocessors (SMM) and distributed memory multicomputers (DMM) is presented. By reorganizing the matrix on the SMM architecture, the same ring-oriented algorithms can be used on both SMM and DMM systems with all machine dependencies confined to a small set of communication routines. The algorithms are described at a high level with focus on portability and scalability aspects. Implementation aspects of the LU, Cholesky, and QR factorizations and machine specific communication routines for some SMM and DMM systems are discussed. Timing results show that our portable algorithms have performance similar to that of machine specific implementations.

  • 18.
    Dackland, Krister
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Kågström, Bo
    Umeå University, Faculty of Science and Technology, Department of Computing Science. Umeå University, Faculty of Science and Technology, High Performance Computing Center North (HPC2N).
    Van Loan, C.
    Parallel block matrix factorizations on the shared memory multiprocessor IBM 3090 VF/600J. 1992. In: International Journal of Supercomputer Applications, ISSN 0890-2720, Vol. 6, no 1, p. 69-97. Article in journal (Refereed)
  • 19.
    Dackland, Krister
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Kågström, Bo
    Umeå University, Faculty of Science and Technology, Department of Computing Science. Umeå University, Faculty of Science and Technology, High Performance Computing Center North (HPC2N).
    Van Loan, Charles
    Design and evaluation of parallel block algorithms: LU factorization on an IBM 3090 VF/600J. 1992. In: Proceedings of the Fifth SIAM Conference on Parallel Processing for Scientific Computing / [ed] Jack Dongarra, Ken Kennedy, Paul Messina, Danny C. Sorensen, Robert G. Voigt, Houston: SIAM Publications, 1992, p. 3-10. Conference paper (Refereed)
  • 20. Durango, Jonas
    et al.
    Dellkrantz, Manfred
    Maggio, Martina
    Klein, Cristian
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Papadopoulos, Alessandro Vittorio
    Hernandez-Rodriguez, Francisco
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Arzen, Karl-Erik
    Control-theoretical load-balancing for cloud applications with brownout. 2014. In: 2014 IEEE 53RD ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC), 2014, p. 5320-5327. Conference paper (Refereed)
    Abstract [en]

    Cloud applications are often subject to unexpected events like flash crowds and hardware failures. Without a predictable behaviour, users may abandon an unresponsive application. This problem has been partially solved on two separate fronts: first, by adding a self-adaptive feature called brownout inside cloud applications to bound response times by modulating user experience, and, second, by introducing replicas - copies of the applications having the same functionalities - for redundancy and adding a load-balancer to direct incoming traffic. However, existing load-balancing strategies interfere with brownout self-adaptivity. Load-balancers are often based on response times, which are already controlled by the self-adaptive features of the application and hence are not a good indicator of how well a replica is performing. In this paper, we present novel load-balancing strategies, specifically designed to support brownout applications. They base their decision not on response time, but on user experience degradation. We implemented our strategies in a self-adaptive application simulator, together with some state-of-the-art solutions. Results obtained in multiple scenarios show that the proposed strategies bring significant improvements when compared to the state-of-the-art ones.

  • 21.
    Edelman, Alan
    et al.
    MIT, USA.
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Kågström, Bo
    Umeå University, Faculty of Science and Technology, Department of Computing Science. Umeå University, Faculty of Science and Technology, High Performance Computing Center North (HPC2N).
    A Geometric Approach to Perturbation Theory of Matrices and Matrix Pencils. Part I: Versal Deformations. 1997. In: SIAM Journal on Matrix Analysis and Applications, Vol. 18, no 3, p. 653-692. Article in journal (Refereed)
    Abstract [en]

    We derive versal deformations of the Kronecker canonical form by deriving the tangent space and orthogonal bases for the normal space to the orbits of strictly equivalent matrix pencils. These deformations reveal the local perturbation theory of matrix pencils related to the Kronecker canonical form. We also obtain a new singular value bound for the distance to the orbits of less generic pencils. The concepts, results, and their derivations are mainly expressed in the language of numerical linear algebra. We conclude with experiments and applications.

  • 22.
    Edelman, Alan
    et al.
    MIT, USA.
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Kågström, Bo
    Umeå University, Faculty of Science and Technology, Department of Computing Science. Umeå University, Faculty of Science and Technology, High Performance Computing Center North (HPC2N).
    A Geometric Approach to Perturbation Theory of Matrices and Matrix Pencils. Part II: A Stratification-Enhanced Staircase Algorithm. 1999. In: SIAM Journal on Matrix Analysis and Applications, Vol. 20, no 3, p. 667-699. Article in journal (Refereed)
    Abstract [en]

    Computing the Jordan form of a matrix or the Kronecker structure of a pencil is a well-known ill-posed problem. We propose that knowledge of the closure relations, i.e., the stratification, of the orbits and bundles of the various forms may be applied in the staircase algorithm. Here we discuss and complete the mathematical theory of these relationships and show how they may be applied to the staircase algorithm. This paper is a continuation of our Part I paper on versal deformations, but it may also be read independently.

  • 23.
    Edmonds, Andy
    et al.
    Metsch, Thijs
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Marshall, Jamie
    Ganschosov, Petov
    FluidCloud: An Open Framework for Relocation of Cloud Services. 2013. In: The 5th USENIX Workshop on Hot Topics in Cloud Computing (HotCloud'13), 2013. Conference paper (Refereed)
    Abstract [en]

    Cloud computing delivers new levels of being connected, instead of the once disconnected PC-type systems. The proposal in this paper extends that level of connectedness in the cloud such that cloud service instances, hosted by providers, can relocate between clouds. This is key in order to provide economical and regulatory benefits but more importantly liberation and positive market disruption.

    While service providers want to lock in their customers' services, FluidCloud aims to liberate them and thereby allow the service owner to freely choose the best matching provider at any time. In the cloud world of competing cloud standards and software solutions, each only partially complete, the central research question this paper intends to answer is: how can relocation of service instances between clouds be intrinsically enabled and fully automated?

  • 24.
    Edmundsson, Niklas
    et al.
    Umeå University, Faculty of Science and Technology, High Performance Computing Center North (HPC2N).
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science. Umeå University, Faculty of Science and Technology, High Performance Computing Center North (HPC2N).
    Kågström, Bo
    Umeå University, Faculty of Science and Technology, Department of Computing Science. Umeå University, Faculty of Science and Technology, High Performance Computing Center North (HPC2N).
    Mårtensson, Markus
    Umeå University, Faculty of Science and Technology, High Performance Computing Center North (HPC2N).
    Nylén, Mats
    Umeå University, Faculty of Science and Technology, High Performance Computing Center North (HPC2N).
    Sandgren, Åke
    Umeå University, Faculty of Science and Technology, High Performance Computing Center North (HPC2N).
    Wadenstein, Mattias
    Umeå University, Faculty of Science and Technology, High Performance Computing Center North (HPC2N).
    Design and Evaluation of a TOP100 Linux Super Cluster System. 2004. In: Concurrency and Computation: Practice and Experience, ISSN 1532-0634, Vol. 16, no 8, p. 735-750. Article in journal (Refereed)
    Abstract [en]

    The High Performance Computing Center North (HPC2N) Super Cluster is a truly self-made high-performance Linux cluster with 240 AMD processors in 120 dual nodes, interconnected with a high-bandwidth, low-latency SCI network. This contribution describes the hardware selected for the system, the work needed to build it, important software issues and an extensive performance analysis. The performance is evaluated using a number of state-of-the-art benchmarks and software, including STREAM, Pallas MPI, the Atlas DGEMM, High-Performance Linpack and NAS Parallel benchmarks. Using these benchmarks we first determine the raw memory bandwidth and network characteristics; the practical peak performance of a single CPU, a single dual-node and the complete 240-processor system; and investigate the parallel performance for non-optimized dusty-deck Fortran applications. In summary, this $500 000 system is extremely cost-effective and shows the performance one would expect of a large-scale supercomputing system with distributed memory architecture. According to the TOP500 list of June 2002, this cluster was the 94th fastest computer in the world. It is now fully operational and stable as the main computing facility at HPC2N. The system’s utilization figures exceed 90%, i.e. all 240 processors are on average utilized over 90% of the time, 24 hours a day, seven days a week.

  • 25.
    Elmroth, E.
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Gardfjall, P.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Tordsson, J.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Ali-Eldin, A.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Larsson, L.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    METHOD, NODE AND COMPUTER PROGRAM FOR ENABLING AUTOMATIC ADAPTATION OF RESOURCE UNITS. 2015. Patent (Other (popular science, discussion, etc.))
  • 26.
    Elmroth, Erik
    NERSC, Lawrence Berkeley National Laboratory, University of California, Berkeley, CA, USA.
    On Grid Partitioning for a High Performance Groundwater Simulation Software. 2000. In: Simulation and Visualization on the Grid / [ed] B. Engquist et al., Heidelberg/Berlin, Germany: Springer, 2000, 13, p. 221-233. Chapter in book (Refereed)
  • 27.
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    On the stratification of the Kronecker canonical form. 1995. Report (Other academic)
    Abstract [en]

    The Kronecker structure hierarchy, i.e., the stratification of the Kronecker canonical form, reveals which Kronecker structures are close to a given structure. For a given matrix pencil A - λB, the Kronecker structure hierarchy shows all structures that are within the closure of orbit(A - λB), and each structure whose orbit's closure contains A - λB. In order to gain new insight into the problem of stratification, we give new interpretations of important results by Pokrzywa for determining closure relations among orbits of Kronecker structures. This is partly done by generalizing classical theorems by Gantmacher. The results are used to derive an algorithm for computation of the complete Kronecker structure hierarchy, or the Kronecker structure hierarchy above or below a given structure. The algorithm is presented in terms of the rank decisions required in a staircase algorithm in order to compute the Kronecker structure hierarchy.

  • 28.
    Elmroth, Erik
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Ding, Chris
    Wu, Yu-Shu
    High Performance Computations for Large Scale Simulations of Subsurface Multiphase Fluid and Heat Flow. 2001. In: Journal of Supercomputing, ISSN 0920-8542, E-ISSN 1573-0484, Vol. 18, no 3, p. 235-258. Article in journal (Refereed)
    Abstract [en]

    TOUGH2 is a widely used reservoir simulator for solving subsurface flow related problems such as nuclear waste geologic isolation, environmental remediation of soil and groundwater contamination, and geothermal reservoir engineering. It solves a set of coupled mass and energy balance equations using a finite volume method. This contribution presents the design and analysis of a parallel version of TOUGH2. The parallel implementation first partitions the unstructured computational domain. For each time step, a set of coupled non-linear equations is solved with Newton iteration. In each Newton step, a Jacobian matrix is calculated and an ill-conditioned non-symmetric linear system is solved using a preconditioned iterative solver. Communication is required for convergence tests and data exchange across partitioning borders. Parallel performance results on Cray T3E-900 are presented for two real application problems arising in the Yucca Mountain nuclear waste site study. The execution time is reduced from 7504 seconds on two processors to 126 seconds on 128 processors for a 2D problem involving 52,752 equations. For a larger 3D problem with 293,928 equations the time decreases from 10,055 seconds on 16 processors to 329 seconds on 512 processors.
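    The timings quoted in this abstract can be turned into relative speedup and parallel efficiency figures. The following is a minimal sketch, not part of the paper: the function names and the `RUNS` table are hypothetical, and efficiency is measured relative to the smallest processor count reported (2 and 16 processors), since no single-processor time is given.

```python
# Relative speedup and parallel efficiency from the timings quoted in the abstract.
# All names here are illustrative assumptions, not taken from the paper.

def relative_speedup(t_base, t_scaled):
    """How many times faster the scaled run is than the baseline run."""
    return t_base / t_scaled


def parallel_efficiency(p_base, t_base, p_scaled, t_scaled):
    """Speedup divided by the factor by which the processor count grew."""
    return relative_speedup(t_base, t_scaled) / (p_scaled / p_base)


# (baseline processors, baseline seconds, scaled processors, scaled seconds)
RUNS = {
    "2D, 52,752 equations": (2, 7504.0, 128, 126.0),
    "3D, 293,928 equations": (16, 10055.0, 512, 329.0),
}


def summarize(runs):
    """Return {label: (speedup, efficiency)} for every run in the table."""
    result = {}
    for label, (p0, t0, p1, t1) in runs.items():
        result[label] = (
            relative_speedup(t0, t1),
            parallel_efficiency(p0, t0, p1, t1),
        )
    return result


if __name__ == "__main__":
    for label, (speedup, efficiency) in summarize(RUNS).items():
        print(f"{label}: speedup {speedup:.1f}x, efficiency {efficiency:.1%}")
```

    Under these assumptions, both problems keep a relative efficiency above 90% when the processor count grows by a factor of 64 and 32, respectively.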

  • 29.
    Elmroth, Erik
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Ding, Chris
    Wu, Yu-Shu
    Pruess, Karsten
    A parallel implementation of the TOUGH2 software package for large scale multiphase fluid and heat flow simulations. 1999. In: A Parallel Implementation of the TOUGH2 Software Package for Large Scale Multiphase Fluid and Heat Flow Simulations: Proceedings of the 1999 ACM/IEEE conference on Supercomputing, ACM/IEEE, 1999, p. 52-52. Conference paper (Refereed)
    Abstract [en]

    TOUGH2 is a widely used simulation package for solving groundwater flow related problems such as nuclear waste isolation, environmental remediation, and geothermal reservoir engineering. It solves a set of coupled mass and energy balance equations using a finite volume method. The parallel implementation first partitions the unstructured computational domain. For each time step, a set of coupled non-linear equations is solved with Newton iteration. In each Newton step, a Jacobian matrix is calculated and an ill-conditioned non-symmetric linear system is solved in parallel using a preconditioned iterative solver. Communication is required for convergence tests and data exchange across partitioning borders. A real problem with 17,584 blocks and 43,815 connections indicates good scalability properties. From 2 to 128 processors on Cray T3E, the solution time is reduced from 7984 to 126 seconds. Improved parallel performance is expected for larger problems with 10^5-10^6 blocks in a Yucca Mountain nuclear waste site study.

  • 30.
    Elmroth, Erik
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Factor, Michael
    Miller, Ethan
    Seltzer, Margo
    Is the Future of Preservation Cloudy? 2013. Report (Other academic)
    Abstract [en]

    This report documents the program and the outcomes of Dagstuhl Seminar 12472 “Is the Future of Preservation Cloudy?”. Our seminar was composed of a series of panels structured as a series of brief presentations followed by an open discussion. The seminar started with a session introducing key concepts and definitions and illuminating the vast array of perspectives from which attendees were addressing issues of cloud and preservation. We then proceeded into a discussion of requirements from different types of communities and a subsequent discussion on how to protect the data and ensure its integrity and reliability. We next considered issues related to cloud infrastructure, in particular related to management of the bits and logical obsolescence. We also considered the economics of preservation and the ability to reuse knowledge. In addition to these pre-planned panels, we had three breakout sessions that were identified by the participants: automated appraisal, design for forgetting, and PaaS/SaaS for data preservation. After the executive summary, we present summaries of the panels and reports on the breakout sessions, followed by brief abstracts from a majority of the seminar participants describing the material they presented in the panels.

  • 31.
    Elmroth, Erik
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science. Umeå University, Faculty of Science and Technology, High Performance Computing Center North (HPC2N).
    Gardfjäll, Peter
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Design and Evaluation of a Decentralized System for Grid-wide Fairshare Scheduling. 2005. In: Proceedings of the First International Conference on e-Science and Grid Computing (e-Science’05), USA, Los Alamitos: IEEE Computer Society Press, 2005, p. 221-229. Conference paper (Refereed)
    Abstract [en]

    This contribution presents a decentralized architecture for a grid-wide fairshare scheduling system and demonstrates its potential in a simulated environment. The system, which preserves local site autonomy, enforces locally and globally scoped share policies, allowing local resource capacity as well as global grid capacity to be logically divided across different groups of users. The policy model is hierarchical and subpolicy definition can be delegated so that, e.g., a VO that has been granted a resource share can partition its share across its projects, which in turn can divide their shares between project members. There is no need for a central coordinator as policies are enforced collectively by the resource schedulers. Each local scheduler adopts a grid-wide view on utilization in order to steer local resource utilization to not only maintain local resource shares but also to contribute to maintaining global shares across the entire set of grid resources. Share enforcement is addressed by an algorithm that calculates simple priority values, thus simplifying integration with local schedulers, which can remain unaware of the hierarchical share policy structure.

  • 32.
    Elmroth, Erik
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science. Umeå University, Faculty of Science and Technology, High Performance Computing Center North (HPC2N).
    Gardfjäll, Peter
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Mulmo, Olle
    Sandholm, Thomas
    An OGSA-based Bank Service for Grid Accounting Systems. 2006. In: State-of-the-art in Scientific Computing, Springer-Verlag, 2006, p. 1051-1060. Conference paper (Refereed)
    Abstract [en]

    This contribution presents the design and implementation of a bank service, constituting a key component in a recently developed Grid accounting system. The Grid accounting system maintains a Grid-wide view of the resources consumed by members of a virtual organization (VO). The bank is designed as an online service, managing the accounts of VO projects. Each service request is transparently intercepted by the accounting system, which acquires a reservation on a portion of the project’s bank account prior to servicing the request. Upon service completion, the account is charged for the consumed resources. We present the overall bank design and technical details of its major components, as well as some illustrative examples of relevant service interactions. The system, which has been implemented using the Globus Toolkit, is based on state-of-the-art Web and Grid services technology and complies with the Open Grid Services Architecture (OGSA).
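    The reserve-then-charge interaction described in this abstract can be illustrated with a small in-memory model. This is a hedged sketch only: the `ProjectAccount` class, its method names, and the choice to charge the actually consumed amount are hypothetical and do not reflect the real bank service interface or its Globus Toolkit implementation.

```python
# Toy model of "reserve before servicing, charge on completion".
# Class and method names are illustrative assumptions, not the SGAS/OGSA API.

class InsufficientFunds(Exception):
    """Raised when a reservation exceeds the unreserved balance."""


class ProjectAccount:
    def __init__(self, balance):
        self.balance = balance   # total credits allocated to the project
        self.holds = {}          # hold id -> reserved amount
        self._next_id = 1

    def available(self):
        """Credits not yet tied up in outstanding reservations."""
        return self.balance - sum(self.holds.values())

    def reserve(self, amount):
        """Place a hold before the request is serviced; return a hold id."""
        if amount <= 0:
            raise ValueError("reservation must be positive")
        if amount > self.available():
            raise InsufficientFunds(f"requested {amount}, available {self.available()}")
        hold_id = self._next_id
        self._next_id += 1
        self.holds[hold_id] = amount
        return hold_id

    def charge(self, hold_id, consumed):
        """Release the hold and charge the resources actually consumed."""
        self.holds.pop(hold_id)  # KeyError if the hold id is unknown
        # Assumption: the account is charged the actual usage, which may
        # differ from the reserved estimate.
        self.balance -= consumed
        return self.balance


if __name__ == "__main__":
    account = ProjectAccount(100)
    hold = account.reserve(40)   # before servicing the request
    account.charge(hold, 25)     # after the job completes
    print(account.balance, account.available())
```

    Keeping holds separate from the balance lets concurrent requests see only the unreserved credits, which is one plausible reading of acquiring a reservation on a portion of the account.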

  • 33.
    Elmroth, Erik
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Gardfjäll, Peter
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Mulmo, Olle
    Sandholm, Thomas
    An OGSA-based bank service for grid accounting systems. 2006. In: Applied parallel computing: state-of-the-art in scientific computing, Springer-Verlag, 2006, p. 1051-1060. Chapter in book (Other academic)
  • 34.
    Elmroth, Erik
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Gardfjäll, Peter
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Mulmo, Olle
    Sandholm, Thomas
    Sandgren, Åke
    Umeå University, Faculty of Science and Technology, High Performance Computing Center North (HPC2N).
    A coordinated accounting solution for SweGrid. 2003. Report (Other (popular science, discussion, etc.))
  • 35.
    Elmroth, Erik
    et al.
    Umeå University, Faculty of Science and Technology, High Performance Computing Center North (HPC2N). Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Gardfjäll, Peter
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Norberg, Arvid
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Tordsson, Johan
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Östberg, Per-Olov
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Designing general, composable, and middleware-independent Grid infrastructure tools for multi-tiered job management. 2007. In: Towards Next Generation Grids / [ed] T. Priol and M. Vaneschi, Springer-Verlag, 2007, p. 175-184. Conference paper (Refereed)
    Abstract [en]

    We propose a multi-tiered architecture for middleware-independent Grid job management. The architecture consists of a number of services for well-defined tasks in the job management process, offering complete user-level isolation of service capabilities, multiple layers of abstraction, control, and fault tolerance. The middleware abstraction layer comprises components for targeted job submission, job control and resource discovery. The brokered job submission layer offers a Grid view on resources, including functionality for resource brokering and submission of jobs to selected resources. The reliable job submission layer includes components for fault tolerant execution of individual jobs and groups of independent jobs, respectively. The architecture is proposed as a composable set of tools rather than a monolithic solution, allowing users to select the individual components of interest. The prototype presented is implemented using the Globus Toolkit 4, integrated with the Globus Toolkit 4 and NorduGrid/ARC middlewares and based on existing and emerging Grid standards. A performance evaluation reveals that the overhead for resource discovery, brokering, middleware-specific format conversions, job monitoring, fault tolerance, and management of individual and groups of jobs is sufficiently small to motivate the use of the framework.

  • 36.
    Elmroth, Erik
    et al.
    Umeå University, Faculty of Science and Technology, High Performance Computing Center North (HPC2N). Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Gardfjäll, Peter
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Tordsson, Johan
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    An Advanced Grid Computing Course for Application and Infrastructure Developers. 2005. In: 2005 IEEE International Symposium on Cluster Computing and the Grid, USA: IEEE Computer Society Press, 2005, p. 43-50. Conference paper (Refereed)
    Abstract [en]

    This contribution presents our experiences from developing an advanced course in grid computing, aimed at application and infrastructure developers. The course was intended for computer science students with extensive programming experience and previous knowledge of distributed systems, parallel computing, computer networking, and security. The presentation includes brief presentations of all topics covered in the course, a list of the literature used, and descriptions of the mandatory computer assignments performed using Globus Toolkit 2 and 3. A summary of our experiences from the course and some suggestions for future directions concludes the presentation.

  • 37.
    Elmroth, Erik
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Gustavson, F. G.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Applying recursion to serial and parallel QR factorization leads to better performance. 2000. In: IBM Journal of Research and Development, ISSN 0018-8646, E-ISSN 2151-8556, Vol. 44, no 4, p. 605-624. Article in journal (Refereed)
    Abstract [en]

    We present new recursive serial and parallel algorithms for QR factorization of an m by n matrix. They improve performance. The recursion leads to an automatic variable blocking, and it also replaces a Level 2 part in a standard block algorithm with Level 3 operations. However, there are significant additional costs for creating and performing the updates, which prohibit the efficient use of the recursion for large n. We present a quantitative analysis of these extra costs. This analysis leads us to introduce a hybrid recursive algorithm that outperforms the LAPACK algorithm DGEQRF by about 20% for large square matrices and up to almost a factor of 3 for tall thin matrices. Uniprocessor performance results are presented for two IBM RS/6000(R) SP nodes: a 120-MHz IBM POWER2 node and one processor of a four-way 332-MHz IBM PowerPC(R) 604e SMP node. The hybrid recursive algorithm reaches more than 90% of the theoretical peak performance of the POWER2 node. Compared to standard block algorithms, the recursive approach also shows a significant advantage in the automatic tuning obtained from its automatic variable blocking. A successful parallel implementation on a four-way 332-MHz IBM PPC604e SMP node based on dynamic load balancing is presented. For two, three, and four processors it shows speedups of up to 1.97, 2.99, and 3.97.

  • 38.
    Elmroth, Erik
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Gustavson, Fred
    A Faster and Simpler Recursive Algorithm for the LAPACK Routine DGELS. 2001. In: BIT Numerical Mathematics, ISSN 0006-3835, E-ISSN 1572-9125, Vol. 41, no 5, p. 936-949. Article in journal (Refereed)
  • 39.
    Elmroth, Erik
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Gustavson, Fred
    High-performance library software for QR factorization. 2001. In: Applied Parallel Computing: New Paradigms for HPC in Industry and Academia. 5th International Workshop, PARA 2000 Bergen, Norway, June 18–20, 2000 Proceedings / [ed] Tor Sørevik, Fredrik Manne, Assefaw Hadish Gebremedhin, Randi Moe, Heidelberg/Berlin, Germany: Springer, 2001, Vol. 1947, p. 53-63. Conference paper (Other academic)
    Abstract [en]

    In [5],[6], we presented algorithm RGEQR3, a purely recursive formulation of the QR factorization. Using recursion leads us to a natural way to choose the k-way aggregating Householder transform of Schreiber and Van Loan [10]. RGEQR3 is a performance critical subroutine for the main (hybrid recursive) routine RGEQRF for QR factorization of a general m×n matrix. This contribution presents a new version of RGEQRF and its accompanying SMP parallel counterpart, implemented for a future release of the IBM ESSL library. It represents a robust high-performance piece of library software for QR factorization on uniprocessor and multiprocessor systems. The implementation builds on previous results [5],[6]. In particular, the new version is optimized in a number of ways to improve the performance; e.g., for small matrices and matrices with a very small number of columns. This is partly done by including mini blocking in the otherwise pure recursive RGEQR3. We describe the salient features of this implementation. Our serial implementation outperforms the corresponding LAPACK routine by 10-65% for square matrices and 10-100% on tall and thin matrices on the IBM POWER2 and POWER3 nodes. The tests covered matrix sizes which varied from very small to very large. The SMP parallel implementation shows close to perfect speedup on a 4-processor PPC604e node.

  • 40.
    Elmroth, Erik
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Gustavson, Fred G
    New serial and parallel recursive QR factorization algorithms for SMP systems. 1998. In: Applied parallel computing: large scale scientific and industrial problems: 4th international workshop, PARA '98, Umeå, Sweden, June 14-17, 1998: proceedings / [ed] Bo Kågström, Jack Dongarra, Erik Elmroth, Jerzy Wasniewski, Heidelberg/Berlin, Germany: Springer, 1998, Vol. 1541, p. 120-128. Conference paper (Other academic)
    Abstract [en]

    We present a new recursive algorithm for the QR factorization of an m by n matrix A. The recursion leads to an automatic variable blocking that allows us to replace a level 2 part in a standard block algorithm by level 3 operations. However, there are some additional costs for performing the updates which prohibit the efficient use of the recursion for large n. This obstacle is overcome by using a hybrid recursive algorithm that outperforms the LAPACK algorithm DGEQRF by 78% to 21% as m=n increases from 100 to 1000. A successful parallel implementation on a PowerPC 604 based IBM SMP node based on dynamic load balancing is presented. For 2, 3, 4 processors and m=n=2000 it shows speedups of 1.96, 2.99, and 3.92 compared to our uniprocessor algorithm.

  • 41.
    Elmroth, Erik
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Gustavson, Fred
    Umeå University, Faculty of Science and Technology, Department of Computing Science. Umeå University, Faculty of Science and Technology, High Performance Computing Center North (HPC2N).
    Jonsson, Isak
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Kågström, Bo
    Umeå University, Faculty of Science and Technology, Department of Computing Science. Umeå University, Faculty of Science and Technology, High Performance Computing Center North (HPC2N).
    Recursive Blocked Algorithms and Hybrid Data Structures for Dense Matrix Library Software. 2004. In: SIAM Review, Vol. 46, no 1, p. 3-45. Article in journal (Refereed)
    Abstract [en]

    Matrix computations are both fundamental and ubiquitous in computational science and its vast application areas. Along with the development of more advanced computer systems with complex memory hierarchies, there is a continuing demand for new algorithms and library software that efficiently utilize and adapt to new architecture features. This article reviews and details some of the recent advances made by applying the paradigm of recursion to dense matrix computations on today's memory-tiered computer systems. Recursion allows for efficient utilization of a memory hierarchy and generalizes existing fixed blocking by introducing automatic variable blocking that has the potential of matching every level of a deep memory hierarchy. Novel recursive blocked algorithms offer new ways to compute factorizations such as Cholesky and QR and to solve matrix equations. In fact, the whole gamut of existing dense linear algebra factorization is beginning to be reexamined in view of the recursive paradigm. Use of recursion has led to using new hybrid data structures and optimized superscalar kernels. The results we survey include new algorithms and library software implementations for level 3 kernels, matrix factorizations, and the solution of general systems of linear equations and several common matrix equations. The software implementations we survey are robust and show impressive performance on today's high performance computing systems.

  • 42.
    Elmroth, Erik
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Henriksson, Daniel
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Distributed usage logging for federated grids. 2010. In: Future Generation Computer Systems, ISSN 0167-739X, E-ISSN 1872-7115, Vol. 26, no 8, p. 1215-1225. Article in journal (Refereed)
    Abstract [en]

    We present a non-intrusive solution to the increasingly important problem of shared logging for overlapping and federated Grid environments. The solution addresses three usage scenarios of hierarchical Grids, mutual cross-Grid resource utilization, and federated Cloud computing infrastructures. The approach is evaluated by extending the existing SweGrid Accounting System (SGAS) with a light-weight component that makes the system applicable to a wide range of usage scenarios. The proposed architecture is characterized by its simplicity, flexibility, and generality, and the new key component by its non-intrusiveness, flexibility, and ability to manage high load. We present requirements derived from three usage scenarios, and also include an in-depth description of the architecture and design, as well as the implementation and performance evaluation of a new component written for use with SGAS. We conclude from a performance evaluation that the sharing of usage data is not likely to be a limiting performance factor even in large-scale Grid scenarios.

  • 43.
    Elmroth, Erik
    et al.
    Umeå University, Faculty of Science and Technology, High Performance Computing Center North (HPC2N). Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Hernández, Francisco
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Tordsson, Johan
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    A light-weight Grid workflow execution service enabling client and middleware independence. 2008. In: Parallel Processing and Applied Mathematics: 7th International Conference on Parallel Processing and Applied Mathematics (PPAM 2007), Springer-Verlag, 2008, p. 754-761. Conference paper (Refereed)
    Abstract [en]

    We present a generic and light-weight Grid workflow execution engine made available as a Grid service. A long-term goal is to facilitate the rapid development of application-oriented end-user workflow tools, while providing a high degree of Grid middleware-independence. The workflow engine is designed for workflow execution, independent of client tools for workflow definition. A flexible plugin-structure for middleware-integration provides a strict separation of the workflow execution and the processing of individual tasks, such as computational jobs or file transfers. The light-weight design is achieved by focusing on the generic workflow execution components and by leveraging state-of-the-art Grid technology, e.g., for state management. The current prototype is implemented using the Globus Toolkit 4 (GT4) Java WS Core and has support for executing workflows produced by Karajan. It also includes plugins for task execution with GT4 as well as a high-level Grid job management framework.

  • 44.
    Elmroth, Erik
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Hernández, Francisco
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Tordsson, Johan
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Three fundamental dimensions of scientific workflow interoperability: model of computation, language, and execution environment. 2010. In: Future Generation Computer Systems, ISSN 0167-739X, E-ISSN 1872-7115, Vol. 26, no 2, p. 245-256. Article in journal (Refereed)
    Abstract [en]

    We investigate interoperability aspects of scientific workflow systems and argue that the workflow execution environment, the model of computation (MoC), and the workflow language form three dimensions that must be considered depending on the type of interoperability sought: at the activity, sub-workflow, or workflow levels. With a focus on the problems that affect interoperability, we illustrate how these issues are tackled by current scientific workflows as well as how similar problems have been addressed in related areas. Our long-term objective is to achieve (logical) interoperability between workflow systems operating under different MoCs, using distinct language features, and sharing activities running on different execution environments.

  • 45.
    Elmroth, Erik
    et al.
    Umeå University, Faculty of Science and Technology, High Performance Computing Center North (HPC2N). Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Hernández, Francisco
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Tordsson, Johan
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Östberg, Per-Olov
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Designing service-based resource management tools for a healthy grid ecosystem. 2008. In: Parallel processing and applied mathematics: 7th International Conference on Parallel Processing and Applied Mathematics, Springer-Verlag, 2008, p. 259-270. Conference paper (Refereed)
    Abstract [en]

    We present an approach for development of Grid resource management tools, where we put into practice internationally established high-level views of future Grid architectures. The approach addresses fundamental Grid challenges and strives towards a future vision of the Grid where capabilities are made available as independent and dynamically assembled utilities, enabling run-time changes in the structure, behavior, and location of software. The presentation is made in terms of design heuristics, design patterns, and quality attributes, and is centered around the key concepts of co-existence, composability, adoptability, adaptability, changeability, and interoperability. The practical realization of the approach is illustrated by five case studies (recently developed Grid tools) highlighting the most distinct aspects of these key concepts for each tool. The approach contributes to a healthy Grid ecosystem that promotes a natural selection of “surviving” components through competition, innovation, evolution, and diversity. In conclusion, this environment facilitates the use and composition of components on a per-component basis.

  • 46.
    Elmroth, Erik
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Holmgren, S.
    Lindemann, J.
    Toor, S.
    Östberg, Per-Olov
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Empowering a flexible application portal with a SOA-based Grid job management framework. 2009. In: Applied Parallel Computing (PARA 08): State of art in scientific computing / [ed] A.C. Elster et al., Springer, 2009. Conference paper (Refereed)
  • 47.
    Elmroth, Erik
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science. Umeå University, Faculty of Science and Technology, High Performance Computing Center North (HPC2N).
    Johansson, Pedher
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Johansson, Stefan
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Kågström, Bo
    Umeå University, Faculty of Science and Technology, Department of Computing Science. Umeå University, Faculty of Science and Technology, High Performance Computing Center North (HPC2N).
    Orbit and Bundle Stratification for Controllability and Observability Matrix Pairs in StratiGraph. 2004. In: Proceedings of the 16th International Symposium on Mathematical Theory of Networks and Systems (MTNS), 2004, p. 1-9. Conference paper (Refereed)
    Abstract [en]

    The canonical structures of controllability and observability pairs (A,B) and (A,C) associated with a state-space system are studied under small perturbations. We show how previous work for general matrix pencils can be applied to the stratification of orbits and bundles of matrix pairs. A stratification provides qualitative information about the closure relation between canonical structures. We also present how the new results are used in StratiGraph, which is a software tool for computing and visualizing closure hierarchies.

  • 48.
    Elmroth, Erik
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Johansson, Pedher
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Johansson, Stefan
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Kågström, Bo
    Umeå University, Faculty of Science and Technology, Department of Computing Science. Umeå University, Faculty of Science and Technology, High Performance Computing Center North (HPC2N).
    Orbit and bundle stratification of controllability and observability matrix pairs in StratiGraph. 2004. In: Proceedings MTNS 2004. Article in journal (Refereed)
  • 49.
    Elmroth, Erik
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Johansson, Pedher
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Kreßner, Daniel
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Kågström, Bo
    Umeå University, Faculty of Science and Technology, Department of Computing Science. Umeå University, Faculty of Science and Technology, High Performance Computing Center North (HPC2N).
    A Web Computing Environment for the SLICOT Library. 2001. In: The Third NICONET Workshop on Numerical Control Software, p. 53-61. Article in journal (Refereed)
    Abstract [en]

    A prototype web computing environment for computations related to the design and analysis of control systems using the SLICOT software library is presented. The web interface can be accessed from a standard world wide web browser with no need for additional software installations on the local machine. The environment provides user-friendly access to SLICOT routines where run-time options are specified by mouse clicks on appropriate buttons. Input data can be entered directly into the web interface by the user or uploaded from a local computer in a standard text format or in Matlab binary format. Output data is presented in the web browser window and possible to download in a number of different formats, including Matlab binary. The environment is ideal for testing the SLICOT software before performing a software installation or for performing a limited number of computations. It is also highly recommended for education as it is easy to use, and basically self-explanatory, with the users' guide integrated in the user interface.

  • 50.
    Elmroth, Erik
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Johansson, Pedher
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Kågström, Bo
    Umeå University, Faculty of Science and Technology, Department of Computing Science. Umeå University, Faculty of Science and Technology, High Performance Computing Center North (HPC2N).
    Bounds for the distance between nearby Jordan and Kronecker structures in a closure hierarchy. 2003. In: Journal of Mathematical Science, ISSN 1072-3374, Vol. 114, no 6, p. 1765-1779. Article in journal (Refereed)
    Abstract [en]

    Computing the fine-canonical-structure elements of matrices and matrix pencils are ill-posed problems. Therefore, besides knowing the canonical structure of a matrix or a matrix pencil, it is equally important to know what are the nearby canonical structures that explain the behavior under small perturbations. Qualitative strata information is provided by our StratiGraph tool. Here, we present lower and upper bounds for the distance between Jordan and Kronecker structures in a closure hierarchy of an orbit or bundle stratification. This quantitative information is of importance in applications, e.g., distance to more degenerate systems (uncontrollability). Our upper bounds are based on staircase regularizing perturbations. The lower bounds are of Eckart-Young type and are derived from a matrix representation of the tangent space of the orbit of a matrix or a matrix pencil. Computational results illustrate the use of the bounds.
