Purpose of the study: to formalize the automatic generation of domain-specific language (DSL) code from natural-language descriptions (Text2DSL) as an independent class of code generation tasks, and to evaluate empirically how structured context affects DSL code generation by a large language model.
Methods: a two-condition experiment (baseline prompting versus context-enhanced prompting) on the PolkitBench dataset, which contains 4,204 verified pairs of a natural-language request and the corresponding Polkit rule. Generated rules are checked with a three-level AST validation procedure built on the esprima parser and scored with metrics of syntactic and semantic correctness.
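The abstract does not spell out the three validation levels, so the following is only a minimal sketch under stated assumptions: level 1 checks that the code parses, level 2 that the tree contains a `polkit.addRule(...)` call, and level 3 that every `polkit.Result.X` uses a result name from a fixed vocabulary. The `parse` argument stands in for any parser that returns an ESTree-shaped dictionary (the shape esprima produces); `validate`, `correctness_rates`, `demo_tree`, and the mapping of levels to the two metrics are hypothetical names and choices, not the authors' implementation.

```python
from typing import Any, Callable, Dict, Iterator, List

Node = Dict[str, Any]

# Result names documented for polkit.Result; the actual PolkitBench
# identifier dictionary is not given in the text, so this set is an assumption.
ALLOWED_RESULTS = {"NO", "YES", "AUTH_SELF", "AUTH_SELF_KEEP",
                   "AUTH_ADMIN", "AUTH_ADMIN_KEEP", "NOT_HANDLED"}


def walk(node: Any) -> Iterator[Node]:
    """Yield every dict node of an ESTree-shaped tree, depth first."""
    if isinstance(node, dict):
        yield node
        for value in node.values():
            yield from walk(value)
    elif isinstance(node, list):
        for item in node:
            yield from walk(item)


def is_member(node: Any, obj: str, prop: str) -> bool:
    """True if node is the non-computed member expression `obj.prop`."""
    if not isinstance(node, dict) or node.get("type") != "MemberExpression":
        return False
    target, name = node.get("object"), node.get("property")
    return (isinstance(target, dict) and target.get("name") == obj
            and isinstance(name, dict) and name.get("name") == prop)


def validate(source: str, parse: Callable[[str], Any]) -> int:
    """Return the highest validation level passed (0 to 3)."""
    try:
        tree = parse(source)                      # level 1: the code parses
    except Exception:
        return 0
    has_rule = any(n.get("type") == "CallExpression"
                   and is_member(n.get("callee"), "polkit", "addRule")
                   for n in walk(tree))
    if not has_rule:                              # level 2: expected structure
        return 1
    for n in walk(tree):                          # level 3: known identifiers
        if (n.get("type") == "MemberExpression"
                and is_member(n.get("object"), "polkit", "Result")):
            prop = n.get("property")
            if not isinstance(prop, dict) or prop.get("name") not in ALLOWED_RESULTS:
                return 2
    return 3


def correctness_rates(levels: List[int]) -> Dict[str, float]:
    """Assumed mapping: syntactic = level >= 1, semantic = level == 3."""
    total = len(levels) or 1
    return {"syntactic": sum(lv >= 1 for lv in levels) / total,
            "semantic": sum(lv == 3 for lv in levels) / total}


def demo_tree(result_name: str) -> Node:
    """Hand-built tree for: polkit.addRule(function(action, subject)
    { return polkit.Result.<result_name>; });"""
    def ident(name: str) -> Node:
        return {"type": "Identifier", "name": name}

    def member(obj: Node, prop: str) -> Node:
        return {"type": "MemberExpression", "object": obj, "property": ident(prop)}

    ret = {"type": "ReturnStatement",
           "argument": member(member(ident("polkit"), "Result"), result_name)}
    fn = {"type": "FunctionExpression",
          "params": [ident("action"), ident("subject")],
          "body": {"type": "BlockStatement", "body": [ret]}}
    call = {"type": "CallExpression",
            "callee": member(ident("polkit"), "addRule"), "arguments": [fn]}
    return {"type": "Program",
            "body": [{"type": "ExpressionStatement", "expression": call}]}


if __name__ == "__main__":
    levels = [validate("<rule source>", lambda _src: demo_tree(name))
              for name in ("YES", "MAYBE")]
    print(levels, correctness_rates(levels))
```

Passing the parser in as a callable keeps the sketch free of third-party imports; in practice `parse` would wrap whichever esprima binding is in use and convert its result to plain dictionaries.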
Results: including structured context (a BNF grammar, an API specification, and a dictionary of valid identifiers) in the prompt raises syntactic correctness from 80.5% to 99.4% (a gain of 18.9 percentage points, reported as a relative increase of 23.4%) and semantic correctness from 60.4% to 95.9% (a gain of 35.5 percentage points, reported as a relative increase of 58.7%). The results indicate that, for the Text2DSL class of tasks, placing the formal specification of the target language in the prompt context is a necessary and sufficient condition for high-quality DSL code generation without fine-tuning the model.
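The bracketed gains are relative increases rather than differences in percentage points, which a short calculation makes explicit. The sketch below uses only the four rounded rates quoted above; the helper name `gains` is illustrative. Recomputing from these rounded inputs gives relative gains within 0.1 of the reported values, which presumably derive from unrounded data.

```python
from typing import Tuple


def gains(before: float, after: float) -> Tuple[float, float]:
    """Return (percentage-point gain, relative gain in percent)."""
    points = after - before
    relative = (after / before - 1.0) * 100.0
    return points, relative


if __name__ == "__main__":
    for label, before, after in (("syntactic", 80.5, 99.4),
                                 ("semantic", 60.4, 95.9)):
        points, relative = gains(before, after)
        print(f"{label}: +{points:.1f} points, +{relative:.1f}% relative")
```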
Scientific novelty: the formalization of Text2DSL as a distinct class of code generation tasks; the PolkitBench dataset of 4,204 verified pairs, validated through three-level AST analysis; and empirical evidence that structured context is critical for accurate DSL code generation by large language models.
No funding sources reported.