Predictive Data Transformation Suggestions in Grafterizer Using Machine Learning

Abstract

Data preprocessing is a crucial step in data analysis. A substantial amount of time is spent on data transformation tasks such as data formatting, modification, extraction, and enrichment, so users benefit from systems that can recommend the most relevant transformations for a given dataset. In this paper, we propose an approach for generating relevant data transformation suggestions for tabular data preprocessing using machine learning (specifically, the Random Forest algorithm). The approach is implemented for Grafterizer, a Web-based framework for tabular data cleaning and transformation, and evaluated through a usability study.

Category

Academic article

Language

English

Author(s)

  • Salhia Sajid
  • Bjørn Marius von Zernichow
  • Ahmet Soylu
  • Titi Roman

Affiliation

  • SINTEF Digital / Sustainable Communication Technologies
  • University of Oslo

Year

2019

Published in

Communications in Computer and Information Science (CCIS)

ISSN

1865-0929

Volume

1057

Page(s)

137–149

View this publication at Norwegian Research Information Repository