• Nazimov Aleksandr Mikhailovich (Nazimov A. M.), staff member of the Academy of the Federal Guard Service of the Russian Federation, Orel, Russia
    s-nazim@list.ru
Keywords: large language models, Polkit, Text2DSL, structured context, program validation

Purpose of the study: to formalize the task of automatic generation of domain-specific language (DSL) code from natural language descriptions – referred to as Text2DSL – as an independent class of code generation problems, and to empirically evaluate the role of structured context in DSL code generation by a large language model.

Methods of research: the study is based on an experimental evaluation conducted under two conditions (baseline mode and context-enhanced mode) using the PolkitBench dataset, which contains 4,204 verified pairs of natural language requests and corresponding Polkit rules. A three-level AST validation procedure based on the esprima parser was employed, along with quantitative metrics of syntactic and semantic correctness.
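
A minimal sketch of what a three-level AST check could look like, written in Python over an ESTree-shaped dictionary (the node format esprima emits). The abstract does not define the three levels, so the split used here (syntax, rule structure, result vocabulary) is an assumption, as are all function names. Parsing itself is left to the parser and is represented only by whether a `Program` node was produced.

```python
from typing import Any, Iterator, Optional

# Result names defined by the polkit JavaScript API.
ALLOWED_RESULTS = {"NO", "YES", "AUTH_SELF", "AUTH_SELF_KEEP",
                   "AUTH_ADMIN", "AUTH_ADMIN_KEEP", "NOT_HANDLED"}


def walk(node: Any) -> Iterator[dict]:
    """Yield every ESTree node (a dict with a "type" key) under node."""
    if isinstance(node, dict):
        if "type" in node:
            yield node
        for child in node.values():
            yield from walk(child)
    elif isinstance(node, list):
        for child in node:
            yield from walk(child)


def is_member(node: Any, obj: str, prop: str) -> bool:
    """True for the non-computed member expression `obj.prop`."""
    if not isinstance(node, dict) or node.get("type") != "MemberExpression":
        return False
    target = node.get("object") or {}
    name = (node.get("property") or {}).get("name")
    return (not node.get("computed", False)
            and target.get("type") == "Identifier"
            and target.get("name") == obj
            and name == prop)


def is_add_rule(stmt: dict) -> bool:
    """True for `polkit.addRule(function (a, b) { ... });`."""
    call = stmt.get("expression") or {}
    if stmt.get("type") != "ExpressionStatement" or call.get("type") != "CallExpression":
        return False
    args = call.get("arguments", [])
    return (is_member(call.get("callee"), "polkit", "addRule")
            and len(args) == 1
            and args[0].get("type") == "FunctionExpression"
            and len(args[0].get("params", [])) == 2)


def validation_level(ast: Optional[dict]) -> int:
    """Return the highest level passed: 0 none, 1 syntax, 2 structure, 3 vocabulary."""
    # Level 1 (assumed): the parser produced a program at all.
    if not ast or ast.get("type") != "Program":
        return 0
    # Level 2 (assumed): the program is a non-empty sequence of polkit.addRule calls.
    body = ast.get("body", [])
    if not body or not all(is_add_rule(s) for s in body):
        return 1
    # Level 3 (assumed): every polkit.Result.<NAME> uses a known result name.
    for node in walk(ast):
        if (node["type"] == "MemberExpression"
                and is_member(node.get("object"), "polkit", "Result")
                and (node.get("property") or {}).get("name") not in ALLOWED_RESULTS):
            return 2
    return 3


def ident(name: str) -> dict:
    return {"type": "Identifier", "name": name}


def member(obj: dict, prop: str) -> dict:
    return {"type": "MemberExpression", "computed": False,
            "object": obj, "property": ident(prop)}


def sample_program(result_name: str) -> dict:
    """Hand-built ESTree for a rule that only returns polkit.Result.<result_name>."""
    ret = {"type": "ReturnStatement",
           "argument": member(member(ident("polkit"), "Result"), result_name)}
    func = {"type": "FunctionExpression", "id": None,
            "params": [ident("action"), ident("subject")],
            "body": {"type": "BlockStatement", "body": [ret]}}
    call = {"type": "CallExpression",
            "callee": member(ident("polkit"), "addRule"), "arguments": [func]}
    return {"type": "Program",
            "body": [{"type": "ExpressionStatement", "expression": call}]}
```

Here `sample_program` hand-builds the tree for a one-statement rule so the example runs without a parser; in practice the dictionary would come from the parser's output. With these definitions, `validation_level(sample_program("YES"))` returns 3, while an unknown result name such as `"MAYBE"` stops at level 2.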

Results: adding structured context (a BNF grammar, an API specification, and a dictionary of valid identifiers) raises syntactic correctness from 80.5 % to 99.4 % (+18.9 percentage points, a relative gain of 23.4 %) and semantic correctness from 60.4 % to 95.9 % (+35.5 percentage points, a relative gain of 58.7 %). The results are taken to show that, for Text2DSL tasks, including the formal specification of the target language in the prompt context is a necessary and sufficient condition for high-quality DSL code generation without fine-tuning the model.
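
The two experimental conditions differ only in what accompanies the request in the prompt. The following Python sketch illustrates one way the baseline and context-enhanced prompts could be assembled from the three components listed above. The class name, function name, section titles and instruction wording are all hypothetical, and the context values are placeholders because the abstract does not reproduce the actual grammar or specification.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class StructuredContext:
    """The three context components named in the abstract."""
    grammar_bnf: str              # BNF grammar of the target DSL
    api_spec: str                 # description of the available API
    identifiers: Tuple[str, ...]  # dictionary of valid identifiers


def build_prompt(request: str, context: Optional[StructuredContext] = None) -> str:
    """Return the baseline prompt, or the context-enhanced one if context is given.

    The section titles and instruction wording are hypothetical.
    """
    parts = ["Write a Polkit rule for the request below. Output code only."]
    if context is not None:
        parts += [
            "## Grammar (BNF)\n" + context.grammar_bnf,
            "## API specification\n" + context.api_spec,
            "## Valid identifiers\n" + "\n".join(context.identifiers),
        ]
    parts.append("## Request\n" + request)
    return "\n\n".join(parts)


# Placeholder context; the real grammar and specification are not given in the abstract.
example_context = StructuredContext(
    grammar_bnf="<rule> ::= ...",
    api_spec="polkit.addRule(function(action, subject) { ... })",
    identifiers=("polkit.Result.YES", "polkit.Result.NO"),
)

baseline_prompt = build_prompt("Allow members of group wheel to mount disks.")
context_prompt = build_prompt("Allow members of group wheel to mount disks.", example_context)
```

Keeping the request text identical across both calls means any difference in output quality can be attributed to the added context rather than to the wording of the task.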

Scientific novelty: the study formalizes the Text2DSL problem as a distinct class of code generation tasks; introduces the PolkitBench dataset consisting of 4,204 verified pairs validated through a three-level AST analysis; and provides empirical evidence of the critical role of structured context in enabling accurate DSL code generation by large language models.

No funding sources reported.

DOI: 10.21681/3034-4050-2026-2-43-49; UDC: 004.82; Journal: Телекоммуникации и связь; Year: 2026; Issue: No. 2 (11); Pages: 43–49; ISSN: ПИ № ФС77-88069