TranSQL: A Transformer-based Model for Classifying SQL Queries

Abstract

Domain-Specific Languages (DSLs) are becoming popular in various fields because they enable domain experts to focus on domain-specific concepts rather than software-specific ones. Domain experts often reuse previously-written scripts when writing new ones; to make this process straightforward, they need techniques that let them easily find existing relevant scripts. One fundamental component of such a technique is a model for identifying similar DSL scripts. Nevertheless, the inherent nature of DSLs and the scarcity of training data make building such a model challenging. Hence, in this work, we propose TRANSQL, a transformer-based model for classifying DSL scripts based on their similarity in a few-shot setting. We build TRANSQL on BERT and GPT-3, two performant language models. Our experiments focus on SQL, one of the most commonly used DSLs. The results reveal that the BERT-based TRANSQL does not perform well for DSLs, since BERT requires extensive data for the fine-tuning phase; the GPT-based TRANSQL, however, yields markedly better and more promising results.
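The core task the abstract describes, classifying a query by its similarity to a small set of previously labeled ones, can be illustrated with a minimal sketch. This is a hypothetical nearest-neighbor baseline using token-set Jaccard similarity as a cheap stand-in for the transformer embeddings TranSQL derives from BERT or GPT-3; the function names and example labels are illustrative, not from the paper.

```python
# Hypothetical baseline: label a SQL query by its most similar labeled
# neighbor. Jaccard similarity over word tokens stands in for the
# transformer-based similarity used by TranSQL (an assumption for
# illustration only).
import re

def tokenize(sql: str) -> set:
    """Lowercase a SQL query and split it into a set of word tokens."""
    return set(re.findall(r"[a-z_]\w*", sql.lower()))

def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    return len(a & b) / len(a | b) if (a | b) else 0.0

def classify(query: str, labeled: list) -> str:
    """Return the label of the most similar previously-written query."""
    q = tokenize(query)
    return max(labeled, key=lambda item: jaccard(q, tokenize(item[0])))[1]

# Few-shot context: a handful of labeled example queries.
labeled_queries = [
    ("SELECT name FROM users WHERE age > 30", "selection"),
    ("INSERT INTO users (name, age) VALUES ('Ada', 36)", "insertion"),
]
print(classify("SELECT email FROM users WHERE age < 18", labeled_queries))
# -> selection
```

A transformer-based model replaces the token-overlap score with learned embeddings, which is what lets it recognize semantically similar queries that share few surface tokens.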

Category

Academic chapter

Language

English

Author(s)

  • Shirin Tahmasebi
  • Amir Hossein Payberah
  • Ahmet Soylu
  • Titi Roman
  • Mihhail Matskin

Affiliation

  • SINTEF Digital / Sustainable Communication Technologies
  • Royal Institute of Technology
  • OsloMet - Oslo Metropolitan University

Year

2022

Publisher

IEEE (Institute of Electrical and Electronics Engineers)

Book

Proceedings of the 21st IEEE International Conference on Machine Learning and Applications (ICMLA 2022)

ISBN

9781665462839

Page(s)

788–793

View this publication at Norwegian Research Information Repository