Finding and correcting syntax errors using recurrent neural networks

Eddie A Santos; Joshua C Campbell; Abram Hindle; José Nelson Amaral

doi:10.7287/peerj.preprints.3123v1

NOT PEER-REVIEWED

"PeerJ Preprints" is a venue for early communication or feedback before peer review. Data may be preliminary.

Computing Science, University of Alberta, Edmonton, Alberta, Canada

Published: 2017-08-03
Accepted: 2017-08-03

Subject Areas: Data Mining and Machine Learning, Software Engineering
Keywords: syntax error, deep learning, program repair, n-gram, JavaScript, GitHub, RNN, LSTM, syntax error correction, neural network

Copyright: © 2017 Santos et al.
Licence: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.

Cite this article: Santos EA, Campbell JC, Hindle A, Amaral JN. 2017. Finding and correcting syntax errors using recurrent neural networks. PeerJ Preprints 5:e3123v1 https://doi.org/10.7287/peerj.preprints.3123v1

Abstract

Minor syntax errors are made by novice and experienced programmers alike; however, novice programmers lack the years of intuition that help experienced programmers resolve these tiny errors. Standard LR parsers are typically poor at resolving syntax errors and reporting their precise location. We propose a methodology that not only helps locate where syntax errors occur, but also suggests possible changes to the token stream that can fix the error identified. This methodology finds syntax errors by checking if two language models “agree” on each token. If the models disagree, it indicates a possible syntax error; the methodology then tries to suggest a fix by finding an alternative token sequence obtained from the models. We trained two LSTM (long short-term memory) language models on a large corpus of JavaScript code collected from GitHub. The dual LSTM neural network model includes the correct location of the syntax error in its top 4 suggestions 54.74% of the time and produces an exact fix up to 35.50% of the time. The results show that this tool and methodology can locate and suggest corrections for syntax errors. Our methodology is of practical use to all programmers, but will be especially useful to novices frustrated with incomprehensible syntax errors.
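
The agreement check described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two LSTM models are replaced here by toy bigram counters, one reading left to right and one reading right to left, and the names `train_bigram`, `predict`, `disagreements`, and `suggest_fixes` are hypothetical. The sketch also assumes, for brevity, that a fix is always a single inserted token.

```python
from collections import Counter, defaultdict


def train_bigram(sequences):
    """Toy stand-in for an LSTM: count which token follows each token."""
    table = defaultdict(Counter)
    for seq in sequences:
        for context, target in zip(seq, seq[1:]):
            table[context][target] += 1
    return table


def predict(table, context):
    """Most frequent token seen after `context`, or None if it is unseen."""
    counts = table.get(context)
    return counts.most_common(1)[0][0] if counts else None


def disagreements(tokens, forward, backward):
    """Return (index, forward_guess, backward_guess) where the models differ."""
    found = []
    for i in range(1, len(tokens) - 1):
        fwd = predict(forward, tokens[i - 1])   # guess from the left neighbour
        bwd = predict(backward, tokens[i + 1])  # guess from the right neighbour
        if fwd != bwd:
            found.append((i, fwd, bwd))
    return found


def suggest_fixes(tokens, forward, backward):
    """Try inserting each model's guess at each flagged position (assumption:
    insertion only) and keep candidates on which the models fully agree."""
    fixes = []
    for index, fwd, bwd in disagreements(tokens, forward, backward):
        for guess in (fwd, bwd):
            if guess is None:
                continue
            candidate = tokens[:index] + [guess] + tokens[index:]
            if candidate not in fixes and not disagreements(candidate, forward, backward):
                fixes.append(candidate)
    return fixes


# Hypothetical two-sequence "corpus" of already-tokenized code.
corpus = [
    ["if", "(", "x", ")", "{", "}"],
    ["while", "(", "x", ")", "{", "}"],
]
forward_model = train_bigram(corpus)
# The backward model is trained on reversed sequences.
backward_model = train_bigram([seq[::-1] for seq in corpus])

broken = ["if", "(", "x", "{", "}"]  # the closing ")" is missing
print(disagreements(broken, forward_model, backward_model))
print(suggest_fixes(broken, forward_model, backward_model))
```

In this toy setting the two models disagree at the positions on either side of the missing parenthesis, and the only surviving candidate restores it. A real system would rank candidates by model probability rather than by exact agreement.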

Author Comment

This article details an early version of our work on syntax error correction using deep learning neural networks. We are revamping the manuscript with a better baseline comparison and an evaluation against true novice mistakes, in addition to the mutation evaluation presented in this revision.

Since we started work on these improvements, two similar works have appeared:

Gupta, R., Pal, S., Kanade, A., & Shevade, S. (2017). DeepFix: Fixing common C language errors by deep learning. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence.

Bhatia, S., & Singh, R. (2016). Automated correction for syntax errors in programming assignments using recurrent neural networks. arXiv preprint arXiv:1603.06129.
