Scalable Execution of Big Data Workflows using Software Containers

Abstract

Big Data processing involves handling large and complex data sets and incorporating different tools, frameworks, and other processes that help organisations make sense of data collected from various sources. Such sets of operations, referred to as Big Data workflows, require taking advantage of the elasticity of cloud infrastructures for scalability. In this paper, we present the design and prototype implementation of a Big Data workflow approach based on software container technologies and message-oriented middleware (MOM) to enable highly scalable workflow execution. The approach is demonstrated in a use case, together with a set of experiments that show its practical applicability for the scalable execution of Big Data workflows. Furthermore, we present a scalability comparison of our proposed approach with that of Argo Workflows, one of the most prominent tools in the area of Big Data workflows.

Category

Academic chapter

Language

English

Author(s)

Institution(s)

  • SINTEF Digital / Sustainable Communication Technologies
  • Kungliga Tekniska högskolan

Year

2020

Publisher

ACM Publications

Book

MEDES '20: Proceedings of the 12th International Conference on Management of Digital EcoSystems

ISBN

9781450381154

Page(s)

76-83
