Detecting periodicities with Gaussian processes

Institut Fayol - LIMOS, Mines Saint-Étienne, Saint-Étienne, France
CHICAS, Faculty of Health and Medicine, Lancaster University, Lancaster, United Kingdom
Faculty of Life Sciences, University of Manchester, Manchester, United Kingdom
Department of Computer Science and Sheffield Institute for Translational Neuroscience, University of Sheffield, Sheffield, United Kingdom
DOI
10.7287/peerj.preprints.1743v1
Subject Areas
Data Mining and Machine Learning, Optimization Theory and Computation
Keywords
RKHS, Harmonic analysis, circadian rhythm, gene expression, Matérn kernels
Copyright
© 2016 Durrande et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ PrePrints) and either DOI or URL of the article must be cited.
Cite this article
Durrande N, Hensman J, Rattray M, Lawrence ND. 2016. Detecting periodicities with Gaussian processes. PeerJ PrePrints 4:e1743v1

Abstract

We consider the problem of detecting and quantifying the periodic component of a function given noise-corrupted observations of a limited number of input/output tuples. Our approach is based on Gaussian process regression, which provides a flexible non-parametric framework for modelling periodic data. We introduce a novel decomposition of the covariance function as the sum of periodic and aperiodic kernels. This decomposition allows for the creation of sub-models which capture the periodic nature of the signal and its complement. To quantify the periodicity of the signal, we derive a periodicity ratio which reflects the uncertainty in the fitted sub-models. Although the method can be applied to many kernels, we give special emphasis to the Matérn family, from the expression of the reproducing kernel Hilbert space inner product to the implementation of the associated periodic kernels in a Gaussian process toolkit. The proposed method is illustrated by considering the detection of periodically expressed genes in the Arabidopsis genome.
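As a rough illustration of the sum-of-kernels idea, the sketch below fits a Gaussian process whose covariance is the sum of a periodic and an aperiodic kernel, then splits the posterior mean into the two corresponding sub-models. It makes several assumptions that are not taken from the paper: it uses the standard exp-sine-squared periodic kernel and a squared-exponential kernel rather than the Matérn-based periodic kernels the abstract refers to, all hyperparameters (including the period) are fixed by hand instead of estimated, and the final ratio is a simple variance ratio of the sub-model means, not the periodicity ratio derived by the authors. The function names are hypothetical.

```python
import numpy as np

def periodic_kernel(x1, x2, variance=1.0, lengthscale=1.0, period=1.0):
    # Standard exp-sine-squared kernel (an assumption; not the paper's kernel).
    d = x1[:, None] - x2[None, :]
    return variance * np.exp(-2.0 * np.sin(np.pi * d / period) ** 2 / lengthscale ** 2)

def rbf_kernel(x1, x2, variance=1.0, lengthscale=1.0):
    # Squared-exponential kernel standing in for the aperiodic part.
    d = x1[:, None] - x2[None, :]
    return variance * np.exp(-0.5 * d ** 2 / lengthscale ** 2)

def fit_submodels(x, y, x_new, noise_var=0.01, period=1.0):
    # Full covariance of the observations: periodic + aperiodic + noise.
    K = (periodic_kernel(x, x, period=period)
         + rbf_kernel(x, x)
         + noise_var * np.eye(len(x)))
    alpha = np.linalg.solve(K, y)
    # Each sub-model mean uses its own cross-covariance with the data.
    mean_p = periodic_kernel(x_new, x, period=period) @ alpha
    mean_a = rbf_kernel(x_new, x) @ alpha
    return mean_p, mean_a

# Synthetic data: a period-1 sine plus a linear trend and noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 4.0, 40)
y = np.sin(2.0 * np.pi * x) + 0.3 * x + 0.1 * rng.standard_normal(x.size)

x_new = np.linspace(0.0, 4.0, 201)  # grid step 0.02, so 50 steps = one period
mean_p, mean_a = fit_submodels(x, y, x_new)

# Illustrative variance ratio of the sub-model means (not the paper's definition).
ratio = np.var(mean_p) / np.var(mean_p + mean_a)
print(f"illustrative periodic variance ratio: {ratio:.3f}")
```

Because the posterior mean is linear in the cross-covariance, `mean_p + mean_a` equals the mean of the full model, and `mean_p` is exactly periodic with the chosen period. Note that this crude ratio ignores posterior uncertainty and can exceed 1 when the two sub-model means are negatively correlated.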

Author Comment

This is a submission to PeerJ for review.

Supplemental Information

Arabidopsis dataset

Original dataset with the gene expressions for each gene at each time point.

DOI: 10.7287/peerj.preprints.1743v1/supp-1

Arabidopsis results

File gathering the available results from Edwards et al. (2006) and those obtained in the application section. For both methods, the file gives the value of the criterion and the estimated period.

DOI: 10.7287/peerj.preprints.1743v1/supp-2

Script files for generating Figures 1 to 3

The Python code is provided as Jupyter notebooks, which include detailed comments.

DOI: 10.7287/peerj.preprints.1743v1/supp-3