Abstract
In an era of information abundance and growing demand for automation, combining natural language understanding with computer programming has become increasingly important. This paper presents a contribution to this domain: software that converts natural language problem statements into equivalent Python code, lowering the barrier to programming for non-experts. At the heart of our software is a transformer model trained on an extensive corpus of diverse Python code. This corpus encompasses a wide spectrum of programming concepts and syntactic structures, enabling the model to discern intricate patterns and nuances in the language of code. Such a capability matters for rapid prototyping, software development, and education: by bridging natural language and programming language, it makes it easier for people and machines to work together and could significantly change how we interact with computers. We also analyze related work in this domain and report our findings. Our methodology covers the data processing pipeline and the steps taken to build a fully functional software prototype, along with the architecture underlying it, and we provide a brief comparison with existing solutions. In conclusion, this research represents a milestone in the pursuit of natural human–computer interaction, and our model has the potential to transform the way programs are written.
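To make the NL-to-Python pipeline described above concrete, the following is a minimal sketch of the inference step using a generic pre-trained encoder-decoder transformer. The checkpoint name is an assumption chosen for illustration (the paper's own model and training corpus are not public); any sequence-to-sequence model fine-tuned on natural-language/Python pairs would fit the same pattern.

```python
# Minimal sketch of NL -> Python code generation with an
# encoder-decoder transformer, via the Hugging Face
# `transformers` library. The checkpoint below is a
# hypothetical stand-in, not the authors' model.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

checkpoint = "Salesforce/codet5-base"  # assumed example checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# A natural language problem statement, as the user would type it.
prompt = "write a function that returns the factorial of n"

# Tokenize, generate a code sequence, and decode it back to text.
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

In this pattern the encoder reads the natural language description and the decoder emits Python tokens autoregressively; fine-tuning on paired NL/code data is what teaches the model the mapping between the two languages.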