Suppose there is a process that is modeled by two linear regressions where exists a change point between the two regimes.

Here’s an example:

# Construct sample data
set.seed(121)
a1 <- 1
a2 <- 1.5
b1 <- 0
b2 <- -.15

n <- 101
changepoint <- 30

x  <- seq(0,1,len=n)
y1 <- a1 * x[1:changepoint]     + b1
y2 <- a2 * x[(changepoint+1):n] + b2
y  <- c(y1,y2) + rnorm(n,0,.05)

plot(x, y, pch=19)
abline(a=b1, b=a1, lwd=2, col="orange")
abline(a=b2, b=a2, lwd=2, col="blue")
abline(v=x[changepoint], lty=2)

We wish to find this change point using probabilistic programming. Herein, we’ll use BUGS.

library(BRugs)

run.model <- function(model, samples, data=list(), chainLength=10000, burnin=0.10, 
                      init.func, n.chains=1, thin=1) {
  
  writeLines(model, con="model.txt")  # Write the modelString to a file
  modelCheck( "model.txt" )           # Send the model to BUGS, which checks the model syntax
  if (length(data)>0)                 # If there's any data available...
    modelData(bugsData(data))         # ... BRugs puts it into a file and ships it to BUGS
  modelCompile(n.chains)              # BRugs command tells BUGS to compile the model
  
  if (missing(init.func)) {
    modelGenInits()                   # BRugs command tells BUGS to randomly initialize a chain
  } else {
    for (chain in 1:n.chains) {       # otherwise use user's init data
      modelInits(bugsInits(init.func))
    }
  }
  
  modelUpdate(chainLength*burnin)     # Burn-in period to be discarded
  samplesSet(samples)                 # BRugs tells BUGS to keep a record of the sampled values
  samplesSetThin(thin)                # Set thinning
  modelUpdate(chainLength)            # BRugs command tells BUGS to randomly initialize a chain
}

The Model

The model states that there is one change point, \(\tau\), that separates the two linear regimes.

Let’s assume that both regimes share the same variance (which might not be the case):

\[y_i \sim \mathcal{N}(a_1 x_i + b_1, \sigma^2), ~ ~ i \leq \tau\] \[y_i \sim \mathcal{N}(a_2 x_i + b_2, \sigma^2), ~ ~ i \gt \tau\]

\[a_i \sim \mathcal{N}(\alpha_a, \beta_a), ~ i=1,2\] \[b_i \sim \mathcal{N}(\alpha_b, \beta_b), ~ i=1,2\]

\[\tau \sim \text{DiscreteUniform}(x_{init}, x_{end})\]

Notice that parameters \(a_1\) and \(a_2\) are assumed to belong to similar distributions. This is reasonable, since if that was not the case, it would be easy to identify the changepoint and all this would not be needed.

I will not define \(\alpha, \beta\) as hyperparameters, but will just choose reasonable values looking at the available data.

This model can be written in Bugs like this:

modelString = "
    model {
      tau ~ dcat(xs[])        # the changepoint
      
      a1 ~ dnorm(1,3)
      a2 ~ dnorm(1,3)
      b1 ~ dnorm(0,2)
      b2 ~ dnorm(0,2)
  
      for(i in 1:N) {
        xs[i]  <- 1/N    # all x_i have equal priori probability to be the changepoint
  
        mu[i]  <- step(tau-i) * (a1*x[i] + b1) + 
                  step(i-tau-1) * (a2*x[i] + b2)

        phi[i] <- -log( 1/sqrt(2*pi*sigma2) * exp(-0.5*pow(y[i]-mu[i],2)/sigma2) ) + C

        dummy[i] <- 0
        dummy[i] ~ dpois( phi[i] )
      }
  
      sigma2 ~ dunif(0.001, 2)
      C  <- 100000
      pi <- 3.1416
    }
"

data.list = list(
  N = length(x),
  x = x,
  y = y
)

run.model(modelString, samples=c("tau", "a1", "a2", "b1", "b2", "sigma2"), 
          data=data.list, chainLength=5e4, burnin=0.2, n.chains=4, thin=2)
## model is syntactically correct
## data loaded
## model compiled
## initial values generated, model initialized
## 10000 updates took 29 s
## monitor set for variable 'tau'
## monitor set for variable 'a1'
## monitor set for variable 'a2'
## monitor set for variable 'b1'
## monitor set for variable 'b2'
## monitor set for variable 'sigma2'
## 50000 updates took 138 s

After running it, let’s extract the mcmc chains and get their summaries:

stats <- samplesStats(c("tau", "a1", "a2", "b1", "b2", "sigma2"))
stats
##             mean        sd  MC_error  val2.5pc    median val97.5pc start
## tau    34.770000 1.938e+01 7.305e-01 18.000000 29.000000 1.010e+02 10001
## a1      1.030000 3.790e-01 1.490e-02  0.736400  1.045000 1.384e+00 10001
## a2      1.421000 2.371e-01 9.366e-03  0.522100  1.484000 1.540e+00 10001
## b1     -0.006822 2.715e-02 1.024e-03 -0.070220 -0.002384 3.535e-02 10001
## b2     -0.114800 1.789e-01 7.088e-03 -0.220300 -0.148700 7.432e-01 10001
## sigma2  0.002167 4.056e-04 1.043e-05  0.001581  0.002099 3.196e-03 10001
##        sample
## tau    100000
## a1     100000
## a2     100000
## b1     100000
## b2     100000
## sigma2 100000
tau.hat <- stats[1,5]
a1.hat  <- stats[2,5]
a2.hat  <- stats[3,5]
b1.hat  <- stats[4,5]
b2.hat  <- stats[5,5]

plot(x, y, pch=19)
abline(a=b1,     b=a1,     lwd=1, lty=2, col="orange")
abline(a=b2,     b=a2,     lwd=1, lty=2, col="blue")
abline(a=b1.hat, b=a1.hat, lwd=2,        col="orange")
abline(a=b2.hat, b=a2.hat, lwd=2,        col="blue")
abline(v=x[changepoint], lty=2, col="grey")
abline(v=x[tau.hat],     lty=2)

We can also use some tools to check if the produced chains are reasonable:

library(coda)

samplesHistory("tau", mfrow = c(1, 1))

samplesAutoC("tau", mfrow = c(1, 1), 1)

chain <- samplesSample("tau")
hist(chain[2e4:10e4], breaks=40, prob=T)

Notice that there’s some probabilistic mass over 100 and a bit over zero. This is not unexpected since, for this data, the transition at the changepoint is smooth enough to consider the hypothesis that there’s no changepoint.

Using extra information

If we know that both linear regimes meet at \(x[\tau]\) we can include that information in the model. Also, here I’ll replace \(a_2\) with a delta over \(a_1\):

\[y_i \sim \mathcal{N}(a_1 x_i + b_1, \sigma^2), ~ ~ i \leq \tau\] \[y_i \sim \mathcal{N}((a_1+\delta) x_i + (b_1 - \delta x[\tau]), \sigma^2), ~ ~ i \gt \tau\]

\[a_1 \sim \mathcal{N}(\alpha_a, \beta_a)\] \[b_1 \sim \mathcal{N}(\alpha_b, \beta_b)\]

\[\delta \sim \mathcal{N}(0, 2),\]

\[\tau \sim \text{DiscreteUniform}(x_{init}, x_{end})\]

modelString = "
    model {
      tau ~ dcat(xs[])        # the changepoint
      
      a1 ~ dnorm(1,3)
      b1 ~ dnorm(0,2)

      delta ~ dnorm(0,1)

      for(i in 1:N) {
        xs[i]  <- 1/N    # all x_i have equal priori probability to be the changepoint
  
        mu[i]  <- step(tau-i)   * (  a1      *x[i] +     b1       ) + 
                  step(i-tau-1) * ((a1+delta)*x[i] + (b1 - delta*x[tau]))

        phi[i] <- -log( 1/sqrt(2*pi*sigma2) * exp(-0.5*pow(y[i]-mu[i],2)/sigma2) ) + C

        dummy[i] <- 0
        dummy[i] ~ dpois( phi[i] )
      }
  
      sigma2 ~ dunif(0.001, 2)
      C  <- 100000
      pi <- 3.1416
    }
"

data.list = list(
  N = length(x),
  x = x,
  y = y
)

run.model(modelString, samples=c("tau", "a1", "b1", "delta", "sigma2"), 
          data=data.list, chainLength=5e4, burnin=0.2, n.chains=1, thin=2)
## model is syntactically correct
## data loaded
## model compiled
## initial values generated, model initialized
## 10000 updates took 7 s
## monitor set for variable 'tau'
## monitor set for variable 'a1'
## monitor set for variable 'b1'
## monitor set for variable 'delta'
## monitor set for variable 'sigma2'
## 50000 updates took 50 s
stats <- samplesStats(c("tau", "a1", "b1", "delta", "sigma2"))
stats
##             mean        sd  MC_error  val2.5pc    median val97.5pc start
## tau    32.930000 4.4120000 2.881e-01 26.000000 32.000000 44.000000 10001
## a1      1.006000 0.0901000 6.173e-03  0.829100  1.005000  1.167000 10001
## b1      0.001080 0.0165000 1.037e-03 -0.031720  0.001525  0.033460 10001
## delta   0.482300 0.0935200 6.313e-03  0.308100  0.483400  0.664500 10001
## sigma2  0.002074 0.0003039 4.868e-06  0.001559  0.002047  0.002743 10001
##        sample
## tau     25000
## a1      25000
## b1      25000
## delta   25000
## sigma2  25000
tau.hat    <- stats[1,5]
a1.hat     <- stats[2,5]
b1.hat     <- stats[3,5]
delta.hat  <- stats[4,5]

plot(x, y, pch=19)
abline(a=b1, b=a1, lwd=1, lty=2, col="orange")
abline(a=b2, b=a2, lwd=1, lty=2, col="blue")

abline(a=b1,                      b=a1,           lwd=2, col="orange")
abline(a=b1-delta.hat*x[tau.hat], b=a1+delta.hat, lwd=2, col="blue")

abline(v=x[changepoint], lty=2, col="grey")
abline(v=x[tau.hat],     lty=2)