Identifying discontinuities (or

See also the supplemental material, for further analytic examples, a sensitivity analysis of the described algorithm, implementation of two operational modifications, further mathematical support, and a Matlab package that includes all data presented in the manuscript and implements the algorithm as a function.

This version includes a number of small changes throughout the manuscript.

Reaction times from a single subject learning a psychophysical task, originally reported by Palmeri (1997). The dashed line corresponds to a four-parameter “learning curve,” reported by Heathcote et al. (2000), while the solid lines interpret the same data as approximately linear, with two change-points.

Prior hyperparameters _{0} and _{1} (left) and posterior hyperparameters _{0}′ and _{1}′ for the beta distribution given three different prior distributions and the same empirical observations. See Equation 2 and the Supplement for details.

Likelihood as a function of the probability θ for the binomial distribution, given 20 observations, consisting of 10 success and 10 failures. Darker areas indicate higher relative likelihoods. (Left Panel) When a single parameter is used to describe the data, the maximum likelihood estimator is _{hat} = ^{10}/_{20} = 0.5. (Center Panel) The first ten observations and the second ten observations are examined independently, and assigned their own MLE parameters _{hat_1} = ^{4}/_{10} = 0.4 and _{hat_2} = ^{6}/_{10} = 0.6. This increase in model complexity improves the maximum likelihood, but also results in a smaller (and therefore less favorable) marginal model likelihood than that observed for the one-parameter model. (Right Panel) If splitting the observations into two groups instead reveals that _{hat_1} = ^{2}/_{10} = 0.2 and _{hat_2} = ^{8}/_{10} = 0.8, this results in greater maximum and marginal model likelihoods. All estimations in this figure assume the prior hyperparameters _{0} = _{1} = 1. On the basis of these results, choosing a two-parameter model is justified in the scenario depicted in the right panel, but not for the center panel, despite both having higher maximum likelihoods than the one-parameter model in the left panel.

An example of uniform vs. non-uniform segmentation. (Left) In the uniform case, each segment of time is an equally strong candidate for a putative change-point. In this example, the nine segments each have a common weight of ^{1}/_{9} . (Right) In the non-uniform case, a change-point is presumed to be located somewhere between _{(1)} and _{(10)}, and if any point along that interval is as likely as any other, then longer segments are correspondingly more likely to contain change-points.

Values of log(_{(c)}·_{(c)} given 100 binary observations [0,1,0,1,··· ,0,1] both uncorrected (left) and corrected (right). The dashed line represents the boundary between evidence favoring a change at time (positive) vs. evidence against (negative) at event

Values of log (_{(c)} · _{(c)} ) given 1000 randomized Gaussian observations (

The CPR algorithm applied to a binary string. (Top) The individual Bayes factors k(c) (here presented on a log scale) are calculated for each point, and there is a clear peak when ^{p1′}/_{p0'} for a segment favors a change, which is not the case for either segment in the bottom panel.

Plot of item acquisition in 25 different lists in Jensen et al. (2013) , learned using the Simultaneous Chain procedure. (Left) Estimated time of acquisition for each list item is indicated both by elevation and by shade of gray. (Right) Histogram showing frequency of inter-acquisition periods, in trials, for each item, as well as the number of trials overall needed to acquire all list items. Bars do not sum to 100%, because acquisition did not occur in all lists.

Reaction times as a function of trials and task difficulty in Palmeri (1997). (Top) Each of the 6132 reaction times, color-coded according to their speed and positioned with respect to trial and stimulus numerosity. (Bottom) The model fit resulting from a multiple regression, subdivided according to the CPR algorithm.

Reaction times as a function of trials in Palmeri (1997) for specific stimuli in a single participant. Lines represent Bayesian regression estimates whose subdivisions were determined using to the change-point algorithm. (Left) 6-dot stimulus learning for most rapidly acquired stimulus (blue) and the least rapid (red). (Right) 11-dot stimulus learning for most rapidly acquired stimulus (blue) and the least rapid (red).

3D position data collected by Kaluža et al. (2010). Individual points are multivariate, such that an observation an

3D depiction of the a subset of the movement data from Kaluža et al. (2010) and its associated change-point model. Individual points represent discrete observations and are color-coded continuously with respect to time, as noted on the color bar. The points on the color bar denote the times at which change-points were detected by the CPR algorithm. Ellipses represent the means and covariance associated with particular segments of data, estimated post-hoc using the robust method described by Campbell (1980). The thin colored line, which is drawn along the horizontal plane, represents a moving average of 50 points.

Detailed plot of the data presented in Figure 11, represented in terms of recorded sensor values alone each axis. Gray zones are identified by Kaluža et al. as ‘transitional periods’ (such as falling or sitting down), while dashed lines indicated change-points detected by the CPR algorithm, given a multivariate normal model.

Detailed plot of the data presented in Figure 11, re-analyzed using a multiple linear regression model. The thin, pale lines represent the raw data, while the thick lines correspond to the best-fitting linear fits. Gray zones are identified by Kaluža et al. as ‘transitional periods’ (such as falling or sitting down), while dashed lines indicated change- points detected by the CPR algorithm, given a multiple regression model.

A variety of non-psychological datasets are analyzed to demonstrate broad applications of the

British coal-mining disasters between 1851 and 1962, represented as days between disasters, which are approximately exponentially distributed (left) and number of disasters per year, which are approximately Poisson distributed (right). In both cases, the CPR algorithm, using the appropriate rule-of-thumb empirical prior, finds a single change-point. Each plot shows its distribution’s probability function as a color map, based on the posterior parameter estimates.

Nuclear-magnetic response of rock strata in an oil well bore-hole, reported by Ó Ruanaidh & Fitzgerald (1996). Points correspond to individual observations. The red line indicates the estimated mean (with one standard deviation shown in light red) within segments identified by the CPR algorithm.

Nominal rates on three-month U.S. Treasury bills, calculated monthly from June 1947 to February 2013. (Top) Nominal rates, with red lines displaying the model inferred by a linear regression change-point analysis. (Bottom) First differences of nominal rates (_{i} − _{i−1}), with the means (red) and one standard deviation (light red) from a Gaussian change-point analysis.

Sensitivity analysis performed on two datasets from the main body text and three from the supplement. The decision criterion

Simulated data generated using a chaotic distribution of Gaussian parameters derived from Equation S4. (Left) The full dataset

Change-points identified in simulated data using the alternative strategies laid out for dealing with stationary data. (Top) Trials for which change- points were identified, as a function of the number of subdivisions examined by the dicing operation. (Center) A histogram of how often, across 30 values for the dicing parameter, a precise change-point was identified. (Bottom) Trials for which change-points were identified using the forward-retrospective algorithm.

Feature-rich Matlab function implementing the CPR algorithm for the ten conjugate priors described in the Supplement. The forward-retrospective variant described in the Supplement is implemented as a second function.

All data presented in the manuscript and supplement are included in Matlab format, as well as basic code for performing the analyses using the provided Matlab functions.

The author declares no competing interests.

The following grant information was disclosed by the authors:

NIMH grant (5-R01-MH068073-10) awarded to Peter Balsam.