PromoterPredict: sequence-based modelling of Escherichia coli σ70 promoter strength yields logarithmic dependence between promoter strength and sequence

Biotechnology, Sri Venkateswara College of Engineering, Sriperumbudur, Tamil Nadu, India
Computer Science and Engineering, Sri Venkateswara College of Engineering, Sriperumbudur, Tamil Nadu, India
School of Chemical and BioTechnology, SASTRA Deemed University, Thanjavur, Tamil Nadu 613401, India
DOI
10.7287/peerj.preprints.26759v2
Subject Areas
Bioengineering, Bioinformatics, Biotechnology, Data Mining and Machine Learning
Keywords
Bioengineering, Promoter analysis, Machine learning, Web service, Biotechnology, Bioinformatics, Algorithms, Housekeeping genes
Copyright
© 2018 Bharanikumar et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
Cite this article
Bharanikumar R, R Premkumar KA, Palaniappan A. 2018. PromoterPredict: sequence-based modelling of Escherichia coli σ70 promoter strength yields logarithmic dependence between promoter strength and sequence. PeerJ Preprints 6:e26759v2

Abstract

We present PromoterPredict, a dynamic multiple regression approach to predict the strength of Escherichia coli promoters binding the σ70 factor of RNA polymerase. σ70 promoters are ubiquitously used in recombinant DNA technology, but characterizing their strength is demanding in terms of both time and money. Using a well-characterized set of promoters, we trained a multivariate linear regression model and found that the log of the promoter strength is significantly linearly associated with a weighted sum of the –10 and –35 sequence profile scores. The two regions contributed almost equally to the promoter strength. PromoterPredict accepts –10 and –35 hexamer sequences and returns the predicted promoter strength. It can learn dynamically from user-supplied data to refine the model and yield more confident estimates of promoter strength. PromoterPredict is available as both a web service (https://promoterpredict.com) and a standalone tool (https://github.com/PromoterPredict). Our work presents an intuitive generalization applicable to modelling the strength of other promoter classes.
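The relationship described above can be illustrated with a minimal sketch. It is not the PromoterPredict implementation: it assumes that the profile scores are log-odds position weight matrix scores with a uniform background, that the natural log is used, and that the model is fitted by ordinary least squares. The function names, the toy hexamers and the coefficients used to generate the synthetic strengths are all hypothetical and chosen only for illustration.

```python
import numpy as np

BASES = "ACGT"

def build_pwm(hexamers, pseudocount=1.0, background=0.25):
    """Log-odds position weight matrix from aligned, equal-length sequences.

    Assumption: uniform background and additive pseudocounts.
    """
    length = len(hexamers[0])
    counts = np.full((length, 4), pseudocount)
    for seq in hexamers:
        for pos, base in enumerate(seq):
            counts[pos, BASES.index(base)] += 1
    freqs = counts / counts.sum(axis=1, keepdims=True)
    return np.log2(freqs / background)

def profile_score(pwm, hexamer):
    """Sum of the per-position log-odds for one hexamer."""
    return float(sum(pwm[pos, BASES.index(b)] for pos, b in enumerate(hexamer)))

def fit_log_linear(scores35, scores10, strengths):
    """Least-squares fit of ln(strength) = b0 + b1*score35 + b2*score10."""
    X = np.column_stack([np.ones(len(strengths)), scores35, scores10])
    y = np.log(strengths)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def predict_strength(coef, score35, score10):
    """Back-transform the linear predictor to the strength scale."""
    return float(np.exp(coef[0] + coef[1] * score35 + coef[2] * score10))

# Toy hexamers for illustration only; they are not the training set of the paper.
minus35 = ["TTGACA", "TTGACT", "TTTACA", "CTGACA", "TTGATA", "TAGACA"]
minus10 = ["TATAAT", "TATACT", "TAAAAT", "GATAAT", "TATGAT", "TATAAT"]
pwm35, pwm10 = build_pwm(minus35), build_pwm(minus10)
s35 = np.array([profile_score(pwm35, s) for s in minus35])
s10 = np.array([profile_score(pwm10, s) for s in minus10])

# Synthetic strengths generated from arbitrary, assumed coefficients,
# so that the fit can be checked against a known answer.
assumed = np.array([0.5, 0.3, 0.3])
strengths = np.exp(assumed[0] + assumed[1] * s35 + assumed[2] * s10)

coef = fit_log_linear(s35, s10, strengths)
print("fitted coefficients:", coef)
print("predicted strength for TTGACA / TATAAT:",
      predict_strength(coef,
                       profile_score(pwm35, "TTGACA"),
                       profile_score(pwm10, "TATAAT")))
```

Under these assumptions, dynamic learning from user-supplied data would amount to appending new (–35, –10, strength) observations to the training arrays and calling `fit_log_linear` again.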

Author Comment

Changed Fig. 4 to a table; included a link to the bioinformatics tool in the Abstract. Minor edits to the manuscript (typos etc.).