
A general-purpose distributed pattern mining system

Abstract

This paper explores five pattern mining problems and proposes a new distributed framework called DT-DPM (Decomposition Transaction for Distributed Pattern Mining). DT-DPM addresses the limitations of existing pattern mining approaches by reducing the enumeration search space: it derives the relevant patterns by studying the correlations among the transactions. It first decomposes the set of transactions into several clusters of different sizes, and then explores heterogeneous architectures, including MapReduce, single CPU, and multi CPU, selected according to the density of each subset of transactions. To evaluate the DT-DPM framework, extensive experiments were carried out on five pattern mining problems: Frequent Itemset Mining (FIM), Weighted Itemset Mining (WIM), Uncertain Itemset Mining (UIM), High Utility Itemset Mining (HUIM), and Sequential Pattern Mining (SPM). The experimental results show that DT-DPM improves the scalability of pattern mining algorithms on large databases and outperforms the baseline parallel pattern mining algorithms on big databases.
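
The decompose-then-dispatch idea described in the abstract can be illustrated with a small sketch. This is not the DT-DPM algorithm itself: the greedy Jaccard clustering, the density formula, the thresholds, the backend-selection rule, and every function name below are assumptions made for illustration only. The abstract states only that transactions are grouped into clusters and that an architecture is chosen based on the density of each cluster.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity between two item sets."""
    return len(a & b) / len(a | b) if (a or b) else 0.0

def decompose(transactions, threshold=0.5):
    """Hypothetical greedy clustering: a transaction joins the first cluster
    whose accumulated item set is similar enough, else starts a new one."""
    clusters = []  # each cluster: {"items": set of items, "rows": list of sets}
    for t in transactions:
        t = set(t)
        for c in clusters:
            if jaccard(t, c["items"]) >= threshold:
                c["rows"].append(t)
                c["items"] |= t
                break
        else:
            clusters.append({"items": set(t), "rows": [t]})
    return clusters

def density(cluster):
    """Assumed density measure: filled cells / (rows x distinct items)."""
    cells = len(cluster["rows"]) * len(cluster["items"])
    return sum(len(r) for r in cluster["rows"]) / cells if cells else 0.0

def choose_backend(cluster, small=2, dense=0.8):
    """Assumed dispatch rule (not taken from the paper)."""
    if len(cluster["rows"]) <= small:
        return "single_cpu"
    return "multi_cpu" if density(cluster) >= dense else "mapreduce"

def frequent_itemsets(rows, min_support, max_size=2):
    """Brute-force frequent itemset count inside one cluster."""
    counts = {}
    for r in rows:
        for k in range(1, max_size + 1):
            for combo in combinations(sorted(r), k):
                counts[combo] = counts.get(combo, 0) + 1
    return {i: c for i, c in counts.items() if c >= min_support}

transactions = [["a", "b", "c"], ["a", "b"], ["a", "b", "c"],
                ["x", "y"], ["x", "y", "z"]]
clusters = decompose(transactions)
for c in clusters:
    print(sorted(c["items"]), round(density(c), 3), choose_backend(c))
```

Because each cluster contains only transactions that share items, the candidate itemsets enumerated inside a cluster are drawn from a much smaller item universe than the full database, which is the intuition behind the reduced search space.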

Category

Academic article

Language

English

Author(s)

  • Asma Belhadi
  • Youcef Djenouri
  • Jerry Chun-Wei Lin
  • Alberto Cano

Affiliation

  • SINTEF Digital
  • Norwegian University of Science and Technology
  • Western Norway University of Applied Sciences
  • University of Science and Technology 'Houari Boumediene' Algiers
  • Virginia Commonwealth University

Year

2020

Published in

Applied Intelligence (Boston)

ISSN

0924-669X

Volume

50

Page(s)

2647 - 2662
