USMI Galaxy Demonstrator (UGD): a collection of tools to integrate microorganisms information
- Subject Areas
- Bioinformatics, Microbiology
- Keywords
- MIRRI, Galaxy, bioinformatics tools, data integration, microorganism
- Copyright
- © 2017 Colobraro et al.
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
- Cite this article
- 2017. USMI Galaxy Demonstrator (UGD): a collection of tools to integrate microorganisms information. PeerJ Preprints 5:e2766v1 https://doi.org/10.7287/peerj.preprints.2766v1
Abstract
Because microbial information is fragmented and microorganism applications span many branches of human activity, a comprehensive approach to merging information on microbes is needed. Although online service providers collect a range of data on microorganisms and provide services for microbial Biological Resource Centres (mBRCs), these services are still limited in both content and aims. The USMI Galaxy Demonstrator (UGD), an implementation of the Galaxy framework that exploits the XML-based Microbiological Common Language (MCL), is meant to give researchers integrated access to enriched information from microbial catalogues, and to help mBRC curators validate and enrich the contents of their catalogues. Researchers and mBRC curators can use the UGD to avoid manual, potentially lengthy web searches and to identify and select microorganisms of interest.
UGD tools are written in Python 2.7. They enrich the basic information provided by catalogues with related taxonomy, literature, sequence and chemical compound data retrieved from some of the main databases on the basis of the strain number (the unique identifier of a given culture) and the species name. The data are retrieved by querying database Web Services through either the Simple Object Access Protocol (SOAP) or Representational State Transfer (REST). The MCL format provides a versatile way to archive and exchange data among mBRCs.
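The enrichment step described above can be illustrated with a minimal sketch. It is not the UGD implementation: the endpoint `https://example.org/api/search`, the element names (`Strain`, `StrainNumber`, `SpeciesName`, `Enrichment`), the functions `build_query_url`, `enrich_record` and `fake_fetch`, and all sample values are hypothetical and are not taken from the MCL specification or from any real database. The sketch uses Python 3 for convenience, and a stub stands in for the Web Service call so that it runs offline.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Hypothetical endpoint, for illustration only; not a real service.
BASE_URL = "https://example.org/api/search"


def build_query_url(strain_number, species_name, base_url=BASE_URL):
    """Build a REST-style GET URL keyed on strain number and species name."""
    params = {"strain": strain_number, "species": species_name}
    return base_url + "?" + urlencode(params)


def enrich_record(record_xml, fetch):
    """Return the record with an <Enrichment> child element appended.

    `fetch` is any callable that maps a query URL to a dict of extra
    fields; the dict keys must be valid XML element names.
    """
    root = ET.fromstring(record_xml)
    strain = root.findtext("StrainNumber")
    species = root.findtext("SpeciesName")
    extra = fetch(build_query_url(strain, species))
    enrichment = ET.SubElement(root, "Enrichment")
    for key in sorted(extra):
        ET.SubElement(enrichment, key).text = str(extra[key])
    return ET.tostring(root, encoding="unicode")


def fake_fetch(url):
    """Stand-in for a real Web Service call; returns placeholder data."""
    return {"TaxonId": "0000", "Literature": "placeholder reference"}


# Assumed, simplified record layout (not the actual MCL schema).
record = (
    "<Strain>"
    "<StrainNumber>XYZ 0001</StrainNumber>"
    "<SpeciesName>Genus species</SpeciesName>"
    "</Strain>"
)
enriched = enrich_record(record, fake_fetch)
```

Passing the fetch function as an argument keeps the record handling independent of the transport, so a SOAP or REST client could be substituted for `fake_fetch` without changing `enrich_record`.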
Galaxy is a well-known, open, web-based platform that offers many tools to retrieve, manage and analyze different kinds of information from any life science domain. By exploiting Galaxy's flexibility, UGD implements tools and workflows that can be used to find and integrate a variety of information on microorganisms. UGD tools integrate basic information, which may support mBRC staff in entering all fundamental strain information in a proper format that allows integration and interoperability with external databases. They also extend the output with information on source materials, including species and strain numbers, and retrieve the microorganisms associated with a compound or enzyme in any metabolic pathway, returning the accession number, synonyms and links to external databases for the requested molecule, together with the taxon name and strain number.
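To make the shape of the compound and enzyme lookup concrete, the following sketch models one result row with the fields listed above and filters rows by molecule name. The class `MoleculeHit`, the function `strains_using` and the sample values are hypothetical; the actual UGD output format is not described here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class MoleculeHit:
    """One result row: a molecule and a microorganism associated with it."""
    accession: str
    synonyms: List[str] = field(default_factory=list)
    links: Dict[str, str] = field(default_factory=dict)  # database name -> URL
    taxon_name: str = ""
    strain_number: str = ""


def strains_using(hits: List[MoleculeHit], molecule: str) -> List[Tuple[str, str]]:
    """Return sorted, unique (taxon_name, strain_number) pairs for hits whose
    accession or synonyms match `molecule` (case-insensitive)."""
    wanted = molecule.lower()
    found = set()
    for hit in hits:
        names = [hit.accession] + hit.synonyms
        if any(wanted == name.lower() for name in names):
            found.add((hit.taxon_name, hit.strain_number))
    return sorted(found)


# Placeholder rows, invented for illustration only.
hits = [
    MoleculeHit("ACC0001", ["compound-a"], {}, "Genus alpha", "XYZ 0001"),
    MoleculeHit("ACC0001", ["compound-a"], {}, "Genus beta", "XYZ 0002"),
    MoleculeHit("ACC0002", ["compound-b"], {}, "Genus alpha", "XYZ 0001"),
]
matches = strains_using(hits, "Compound-A")
```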
Author Comment
This abstract was accepted for the CHARME / EMBnet / NETTAB 2016 Workshop.