Python in proteomics

Hannes L Rost

doi:10.7287/peerj.preprints.27736v1

NOT PEER-REVIEWED

"PeerJ Preprints" is a venue for early communication or feedback before peer review. Data may be preliminary.

Hannes L Rost ^1,2,3

1 University of Toronto, Donnelly Centre, Toronto, Ontario, Canada

2 University of Toronto, Department of Molecular Genetics, Toronto, Ontario, Canada

3 University of Toronto, Department of Computer Science, Toronto, Ontario, Canada

Published: 2019-05-16
Accepted: 2019-05-16

Subject Areas: Bioinformatics, Computational Biology
Keywords: Python, Proteomics, Mass Spectrometry

Copyright: © 2019 Rost
Licence: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.

Cite this article: Rost HL. 2019. Python in proteomics. PeerJ Preprints 7:e27736v1 https://doi.org/10.7287/peerj.preprints.27736v1

Abstract

Python is a versatile scripting language that is widely used in industry and academia. In bioinformatics, multiple packages support data analysis with Python, ranging from biological sequence analysis with Biopython, through structural modeling and visualization with packages such as PyMOL and PyRosetta, to numerical computation and advanced plotting with NumPy/SciPy. In the proteomics community, Python began to be widely used around 2012, when several mature Python packages were published, including pymzML, Pyteomics and pyOpenMS. This has led to steadily increasing interest in Python in the proteomics and mass spectrometry community. The number of publications referencing or using Python has risen eightfold since 2012 (compared with the same time period before 2012), and multiple open-source Python packages now support mass spectrometric data analysis and processing. Computing and data analysis in mass spectrometry are very diverse and in many cases must be tailored to a specific experiment. Often, multiple steps (identification, quantification, post-translational modification analysis, filtering, FDR analysis, etc.) have to be performed in an analysis pipeline, which requires a highly flexible approach. This is where Python truly shines, owing to its flexibility, its visualization capabilities and the ability to extend computation with a large number of powerful libraries. Python can be used to quickly prototype software and to combine existing libraries into powerful analysis workflows, while avoiding the trap of reinventing the wheel for each new project.

Here, we describe data analysis with Python using the pyOpenMS package. Extended documentation and a tutorial can also be found online at https://pyopenms.readthedocs.io. To allow the reader to follow all steps in the tutorial, we also describe the installation of the software. Our installation is based on Anaconda, an open-source Python distribution that includes the Spyder integrated development environment (IDE), which allows development with pyOpenMS in a graphical environment.

Author Comment

Book chapter describing the use of Python in Proteomics
