Code Generation by Example Using Symbolic Machine Learning

Kevin Lano; Qiaomu Xue

doi:10.1007/s42979-022-01573-4

Research output: Contribution to journal › Article › peer-review

8 Citations (Scopus)

Abstract

Code generation is a key technique for model-driven engineering (MDE) approaches of software construction. Code generation enables the synthesis of applications in executable programming languages from high-level specifications in UML or in a domain-specific language. Specialised code generation languages and tools have been defined; however, the task of manually constructing a code generator remains a substantial undertaking, requiring a high degree of expertise in both the source and target languages, and in the code generation language. In this paper, we apply novel symbolic machine learning techniques for learning tree-to-tree mappings of software syntax trees, to automate the development of code generators from source–target example pairs. We evaluate the approach on several code generation tasks, and compare the approach to other code generator construction approaches. The results show that the approach can effectively automate the synthesis of code generators from examples, with relatively small manual effort required compared to existing code generation construction approaches. We also identified that it can be adapted to learn software abstraction and translation algorithms. The paper demonstrates that a symbolic machine learning approach can be applied to assist in the development of code generators and other tools manipulating software syntax trees.

Original language	English
Article number	170
Pages (from-to)	1-23
Journal	SN Computer Science
Volume	4
Issue number	2
Early online date	17 Jan 2023
DOIs	https://doi.org/10.1007/s42979-022-01573-4
Publication status	Published - 17 Jan 2023

Keywords

Code generation
model-driven engineering
symbolic machine learning

Access to Document

10.1007/s42979-022-01573-4 (Licence: CC BY)

Cite this

@article{66b5df8b50ef4761b997398bff5a5c22,

title = "Code Generation by Example Using Symbolic Machine Learning",

abstract = "Code generation is a key technique for model-driven engineering (MDE) approaches of software construction. Code generation enables the synthesis of applications in executable programming languages from high-level specifications in UML or in a domain-specific language. Specialised code generation languages and tools have been defined; however, the task of manually constructing a code generator remains a substantial undertaking, requiring a high degree of expertise in both the source and target languages, and in the code generation language. In this paper, we apply novel symbolic machine learning techniques for learning tree-to-tree mappings of software syntax trees, to automate the development of code generators from source–target example pairs. We evaluate the approach on several code generation tasks, and compare the approach to other code generator construction approaches. The results show that the approach can effectively automate the synthesis of code generators from examples, with relatively small manual effort required compared to existing code generation construction approaches. We also identified that it can be adapted to learn software abstraction and translation algorithms. The paper demonstrates that a symbolic machine learning approach can be applied to assist in the development of code generators and other tools manipulating software syntax trees.",

keywords = "Code generation, model-driven engineering, symbolic machine learning",

author = "Kevin Lano and Qiaomu Xue",

note = "Funding Information: Q. Xue was funded by the King{\textquoteright}s China Scholarship award. K. Lano declares that he has no conflict of interest. Q. Xue declares that she has no conflict of interest. This article does not contain any studies with human participants or animals performed by any of the authors. Publisher Copyright: {\textcopyright} 2023, The Author(s).",

year = "2023",

month = jan,

day = "17",

doi = "10.1007/s42979-022-01573-4",

language = "English",

volume = "4",

pages = "1--23",

journal = "SN Computer Science",

issn = "2661-8907",

publisher = "Springer",

number = "2",

}

TY - JOUR

T1 - Code Generation by Example Using Symbolic Machine Learning

AU - Lano, Kevin

AU - Xue, Qiaomu

N1 - Funding Information: Q. Xue was funded by the King’s China Scholarship award. K. Lano declares that he has no conflict of interest. Q. Xue declares that she has no conflict of interest. This article does not contain any studies with human participants or animals performed by any of the authors. Publisher Copyright: © 2023, The Author(s).

PY - 2023/1/17

Y1 - 2023/1/17

N2 - Code generation is a key technique for model-driven engineering (MDE) approaches of software construction. Code generation enables the synthesis of applications in executable programming languages from high-level specifications in UML or in a domain-specific language. Specialised code generation languages and tools have been defined; however, the task of manually constructing a code generator remains a substantial undertaking, requiring a high degree of expertise in both the source and target languages, and in the code generation language. In this paper, we apply novel symbolic machine learning techniques for learning tree-to-tree mappings of software syntax trees, to automate the development of code generators from source–target example pairs. We evaluate the approach on several code generation tasks, and compare the approach to other code generator construction approaches. The results show that the approach can effectively automate the synthesis of code generators from examples, with relatively small manual effort required compared to existing code generation construction approaches. We also identified that it can be adapted to learn software abstraction and translation algorithms. The paper demonstrates that a symbolic machine learning approach can be applied to assist in the development of code generators and other tools manipulating software syntax trees.

AB - Code generation is a key technique for model-driven engineering (MDE) approaches of software construction. Code generation enables the synthesis of applications in executable programming languages from high-level specifications in UML or in a domain-specific language. Specialised code generation languages and tools have been defined; however, the task of manually constructing a code generator remains a substantial undertaking, requiring a high degree of expertise in both the source and target languages, and in the code generation language. In this paper, we apply novel symbolic machine learning techniques for learning tree-to-tree mappings of software syntax trees, to automate the development of code generators from source–target example pairs. We evaluate the approach on several code generation tasks, and compare the approach to other code generator construction approaches. The results show that the approach can effectively automate the synthesis of code generators from examples, with relatively small manual effort required compared to existing code generation construction approaches. We also identified that it can be adapted to learn software abstraction and translation algorithms. The paper demonstrates that a symbolic machine learning approach can be applied to assist in the development of code generators and other tools manipulating software syntax trees.

KW - Code generation

KW - model-driven engineering

KW - symbolic machine learning

UR - http://www.scopus.com/inward/record.url?scp=85146461866&partnerID=8YFLogxK

U2 - 10.1007/s42979-022-01573-4

DO - 10.1007/s42979-022-01573-4

M3 - Article

SN - 2661-8907

VL - 4

SP - 1

EP - 23

JO - SN Computer Science

JF - SN Computer Science

IS - 2

M1 - 170

ER -
