This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
Pathway and cell-type signatures are patterns in transcriptome data that are associated with biological processes or phenotypic consequences. These signatures result from cell-type-specific and pathway-specific expression, but detecting them can require large transcriptomic compendia. Machine learning techniques can be powerful tools for signature discovery because they can provide accurate and interpretable results. In this review, we discuss machine learning approaches for extracting pathway and cell-type signatures from transcriptomic compendia. We focus on the biological motivations for, and interpretation of, both supervised and unsupervised learning approaches in this setting. We consider recent advances, including deep learning, and their applications to growing collections of bulk and single-cell RNA data. As data and compute resources increase, opportunities for machine learning to aid in revealing biological signatures will continue to grow.