Abstract
In an era of information abundance and growing demand for automation, combining natural language understanding with computer programming has become increasingly important. This paper presents a contribution to this domain: software that converts natural language problem statements into equivalent Python code, lowering the barrier to programming for non-experts. At the heart of our software is a transformer model trained on an extensive corpus of diverse Python code. This corpus encompasses a wide spectrum of programming concepts and syntactic structures, enabling the model to discern intricate patterns and nuances in the language of code. Such a capability matters for rapid prototyping, software development, and education: by bridging natural language and programming language, it makes it easier for people and machines to work together and could significantly change how we interact with computers. We also analyze related work in this domain and report our findings. Our methodology covers the data processing pipeline and the steps taken to build a fully functional software prototype, along with the architecture underlying it, and we provide a brief comparison with existing solutions. In conclusion, this research represents a milestone in the pursuit of natural human–computer interaction, and our model has the potential to transform the way programs are written.
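To make the NL-to-Python pipeline described above concrete, the following is a minimal sketch of the inference step using a generic pre-trained encoder-decoder transformer. The checkpoint name is an assumption chosen for illustration (the paper's own model and training corpus are not public); any sequence-to-sequence model fine-tuned on natural-language/Python pairs would fit the same pattern.

```python
# Minimal sketch of NL -> Python code generation with an
# encoder-decoder transformer, via the Hugging Face
# `transformers` library. The checkpoint below is a
# hypothetical stand-in, not the authors' model.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

checkpoint = "Salesforce/codet5-base"  # assumed example checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# A natural language problem statement, as the user would type it.
prompt = "write a function that returns the factorial of n"

# Tokenize, generate a code sequence, and decode it back to text.
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

In this pattern the encoder reads the natural language description and the decoder emits Python tokens autoregressively; fine-tuning on paired NL/code data is what teaches the model the mapping between the two languages.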