TY - JOUR
T1 - Code Generation by Example Using Symbolic Machine Learning
AU - Lano, Kevin
AU - Xue, Qiaomu
N1 - Funding Information:
Q. Xue was funded by the King’s China Scholarship award. K. Lano declares that he has no conflict of interest. Q. Xue declares that she has no conflict of interest. This article does not contain any studies with human participants or animals performed by any of the authors.
Publisher Copyright:
© 2023, The Author(s).
PY - 2023/1/17
Y1 - 2023/1/17
N2 - Code generation is a key technique for model-driven engineering (MDE) approaches of software construction. Code generation enables the synthesis of applications in executable programming languages from high-level specifications in UML or in a domain-specific language. Specialised code generation languages and tools have been defined; however, the task of manually constructing a code generator remains a substantial undertaking, requiring a high degree of expertise in both the source and target languages, and in the code generation language. In this paper, we apply novel symbolic machine learning techniques for learning tree-to-tree mappings of software syntax trees, to automate the development of code generators from source–target example pairs. We evaluate the approach on several code generation tasks, and compare the approach to other code generator construction approaches. The results show that the approach can effectively automate the synthesis of code generators from examples, with relatively small manual effort required compared to existing code generation construction approaches. We also identified that it can be adapted to learn software abstraction and translation algorithms. The paper demonstrates that a symbolic machine learning approach can be applied to assist in the development of code generators and other tools manipulating software syntax trees.
KW - Code generation
KW - model-driven engineering
KW - symbolic machine learning
UR - http://www.scopus.com/inward/record.url?scp=85146461866&partnerID=8YFLogxK
U2 - 10.1007/s42979-022-01573-4
DO - 10.1007/s42979-022-01573-4
M3 - Article
SN - 2661-8907
VL - 4
SP - 1
EP - 23
JO - SN Computer Science
JF - SN Computer Science
IS - 2
M1 - 170
ER -