library(circular) # circular stats, cran.r-project.org/web/packages/circular/circular.pdf

Introduction to plotting tools


Generate some random temporal data:

library(lubridate)  # deal with date values
## 
## Attaching package: 'lubridate'
## The following object is masked from 'package:base':
## 
##     date
N <- 500
events <- as.POSIXct("2011-01-01", tz="GMT") + 
              days(floor(365*runif(N)))      + 
              hours(floor(24*rnorm(N)))      +  
              minutes(floor(60*runif(N))   ) +
              seconds(floor(60*runif(N)))
head(events)
## [1] "2011-11-18 17:47:02 GMT" "2011-08-05 23:22:30 GMT"
## [3] "2011-06-22 16:22:01 GMT" "2011-06-05 14:47:16 GMT"
## [5] "2011-10-18 12:21:06 GMT" "2011-07-26 16:40:32 GMT"
df <- data.frame(datetime = events, eventhour = hour(events))
df$work <- df$eventhour %in% seq(9, 17) # determine if the event falls in business hours (9am to 5pm)
head(df,4)
##              datetime eventhour  work
## 1 2011-11-18 17:47:02        17  TRUE
## 2 2011-08-05 23:22:30        23 FALSE
## 3 2011-06-22 16:22:01        16  TRUE
## 4 2011-06-05 14:47:16        14  TRUE
# make our data into circular class from package circular
df$eventhour <- circular(df$eventhour%%24, # convert to 24 hrs
                         units="hours", 
                         template="clock24")

The package includes a rose diagram plot:

par(mfrow=c(1,2))
rose.diag(df$eventhour, bins = 24, col = "lightblue", main = "Events by Hour (sqrt scale)",
          radii.scale = "sqrt", prop=3)

rose.diag(df$eventhour, bins = 24, col = "lightblue", main = "Events by Hour (linear scale)", 
    prop = 12, radii.scale = "linear", shrink = 1.25, cex = 0.8, ticks=TRUE)

par(mfrow=c(1,1))

We can use a standard linear plot to show a density:

angles <- seq(0,2*pi,len=100)
data   <- dvonmises(circular(angles), mu=circular(pi/10), kappa=2)
plot(angles, data, type="l", col="blue")

And also draw a density in a circular plot:

ff <- function(x) dvonmises(x, mu=circular(pi/10), kappa=20)
curve.circular(ff, col="blue", shrink=1.5, lwd=2)

Or draw a histogram using the rose diagram together with the sample datapoints:

set.seed(221)
rdata <- rvonmises(75, circular(3*pi/2), 2.5)
rose.diag(rdata, bins=50, prop=1.5, shrink=1)
points(rdata, pch=20, col="red")

The von Mises Distribution

The von Mises Distribution is a continuous probability distribution on the circle; it's the circular analogue of the normal distribution. It has two parameters: a centrality one, \(\mu\), and a concentration parameter, \(\kappa\) (the higher it is, the more concentrated the probability mass is over \(\mu\)).

This is its pdf:

\[f(x|\mu,\kappa) = \frac{e^{\kappa \cos(x-\mu)}}{2\pi I_0(\kappa)}\]

where \[I_0(\kappa) = \frac{1}{\pi} \int_0^\pi e^{\kappa \cos(x)} dx\]

The Bessel function \(I_0(\kappa)\) can be computed with the R function besselI(kappa,0) or approximated like this:

I0 <- function(kappa) besselI(kappa,0)

# Approximates Bessel(x,nu)
#  increase iter for better approximation
# cf. functions.wolfram.com/Bessel-TypeFunctions/BesselI/introductions/Bessels/05/
approx.I0 <- function(x, nu=0, iter=20) {
  result <- (x/2)^nu / gamma(nu+1)  # k=0 term of the series (equals 1 when nu=0)
  for(k in 1:iter)
    result <- result + (x/2)^(2*k+nu) / (gamma(k+nu+1)*factorial(k))
  result
}

kappa <- 10
approx.I0(kappa,0)
## [1] 2815.717
I0(kappa)
## [1] 2815.717
I.0(kappa)  # package 'circular' also has this function
## [1] 2815.717
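
All three values also agree with a direct numerical integration of the definition above (a quick check):

# numeric check: I0(kappa) = (1/pi) Int_0^pi exp(kappa*cos(x)) dx
integrate(function(x) exp(kappa*cos(x))/pi, lower=0, upper=pi)$value  # ~2815.717 again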

The next code creates a random sample with mean direction \(\pi/4\):

set.seed(101)
mu_real    <- pi/4
kappa_real <- 15
my_data    <- as.numeric(rvonmises(100, circular(mu_real), kappa_real))

Assuming the sample was generated by a von Mises distribution, we can easily compute the MLE, which is given by function mle.vonmises:

estimates <- mle.vonmises(as.circular(my_data))
estimates$mu[[1]]
## [1] 0.8028
estimates$kappa[[1]]
## [1] 13.8538

Let’s plot the estimation vs the real signal:

# plot two von Mises distributions
plot_circular_compare <- function(mu_real, kappa_real, mu_hat, kappa_hat, 
                                  shrink=1.75, col="red", add=FALSE) {
  ff.real     <- function(x) dvonmises(x, mu=circular(mu_real), kappa=kappa_real)
  ff.inferred <- function(x) dvonmises(x, mu=circular(mu_hat),  kappa=kappa_hat)
  curve.circular(ff.real,     col="blue", shrink=shrink, ylim=c(-0.75,1.25), lwd=2, add=add)
  curve.circular(ff.inferred, col=col,    lwd=2, add=T)
  legend("topleft",c("real","estimate"), col=c("blue",col), lwd=2, bty = "n")
}

plot_circular_compare(mu_real, kappa_real, estimates$mu[[1]], estimates$kappa[[1]])

The next sections show ways to solve this problem in a Bayesian setting using BUGS.

source("run_model.R") # import 'run.model()' to run bugs

BUGS model I: estimate mean direction with known concentration

The easiest variant is when we know \(\kappa\). The specification for this model in BUGS is:

modelString = "
  model {

      for (i in 1:n) {

        ## phi[i] <- -log(exp(kappa * cos(x[i]-mu)) / (2*pi*I0)) + C   pdf simplifies to:
        phi[i] <- - kappa * cos(x[i]-mu) + log(2*pi*I0) + C
        
        dummy[i] <- 0
        dummy[i] ~ dpois( phi[i] )
      }

      # Priors 

      mu    ~ dunif(0,pi2)

      # other values

      C   <- 1000000   # for the zero's trick
      pi  <- 3.14159
      pi2 <- 2*pi
  }
"

Since the von Mises is not a distribution available in BUGS, we implemented its likelihood using the zero’s trick.
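
As a quick sanity check of the trick in R (a sketch using the values already defined): the Poisson likelihood of a dummy zero is exp(-phi), so setting phi = -log(f(x)) + C makes each zero contribute f(x) times the constant factor exp(-C):

# zeros trick check: dpois(0, phi) == exp(-phi)
x   <- 0.5
f_x <- exp(kappa_real*cos(x - mu_real)) / (2*pi*I0(kappa_real)) # von Mises density at x
C   <- 10                     # any constant that keeps phi positive
phi <- -log(f_x) + C
dpois(0, phi)                 # equals...
f_x * exp(-C)                 # ...the density times the constant exp(-C)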

With this model, we initialize parameter \(\mu\) to its empirical mean (to help BUGS start with a good value), and run the model:

# initializations
genInitFactory <- function()  {
  i <- 0
  function() {
    i <<- i + 1
    list( 
      mu = mean(my_data) # start with the mean estimate
    ) 
  }
}

# Everything is ready. Run the model!
run.model(modelString, samples=c("mu"), 
          data=list(I0    = I0(kappa_real),
                    x     = my_data,
                    kappa = kappa_real,
                    n     = length(my_data)), 
          chainLength=5e3, init.func=genInitFactory())
## model is syntactically correct
## data loaded
## model compiled
## Initializing chain 1:
## model is initialized
## 500 updates took 0 s
## monitor set for variable 'mu'
## 5000 updates took 1 s
mu_hat <- samplesStats(c("mu"))[[1]]  # get point estimate of distribution's mean

# plot results
plot_circular_compare(mu_real, kappa_real, mu_hat, kappa_real)

BUGS model II: estimate mean direction using an MLE concentration

Assume we don't know \(\kappa\), but approach the problem in a mixed way: feed the previous model an empirical estimate of \(\kappa\) computed from the available sample. One way to estimate the concentration parameter \(\kappa\) is to use function A1inv, which inverts \(A_1(\kappa)=\bar{R}\) (the mean resultant length) to find the maximum-likelihood estimate:

estimate_kappa <- function(data) {
  sum_sins <- sum(sin(data))
  sum_coss <- sum(cos(data))
  A1inv(mean(cos(data - atan2(sum_sins, sum_coss))))
}
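
As a sanity check (a sketch), this should reproduce the \(\kappa\) returned by mle.vonmises above, since both solve \(A_1(\kappa) = \bar{R}\), the mean resultant length:

estimate_kappa(my_data)
estimates$kappa[[1]]   # 13.8538, from the earlier fit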

So, a simple extension of the previous model is to run it using this estimate of \(\kappa\):

kappa_hat <- estimate_kappa(my_data)

run.model(modelString, samples=c("mu"), 
          data=list(I0    = I0(kappa_hat),
                    x     = my_data,
                    kappa = kappa_hat,
                    n     = length(my_data)), 
          chainLength=5e3, init.func=genInitFactory())
## model is syntactically correct
## data loaded
## model compiled
## Initializing chain 1:
## model is initialized
## 500 updates took 0 s
## monitor set for variable 'mu'
## 5000 updates took 0 s
mu_hat <- samplesStats(c("mu"))[[1]]  # get point estimate of distribution's mean

# plot results
plot_circular_compare(mu_real, kappa_real, mu_hat, kappa_hat)

BUGS model III: estimate mean direction and concentration

Here we want to estimate both parameters using BUGS.

The following solution is based on the fact that it is not easy to compute the Bessel function \(I_0\) inside BUGS (it should be possible, but it is computationally expensive). So, we give BUGS a set of choices of \(\kappa\) for which the respective values \(I_0(\kappa)\) were already computed in R.

The new model specification changes the log-likelihood. Now, instead of using a fixed \(\kappa\) value, we use a variable kappa_hat, which is computed via a random variable k that follows a categorical distribution over all possible pairs \((\kappa, I_0(\kappa))\). These pairs start with equal probabilities.

modelString = "
  model {

      for (i in 1:n) {

        phi[i] <- - kappa_hat * cos(x[i]-mu) + log(2*pi*IO_hat) + C
        
        dummy[i] <- 0
        dummy[i] ~ dpois( phi[i] )
      }

      k  ~ dcat(p[])

      for(j in 1:n_kappas) {
        p[j] <- 1/n_kappas         # creating vector p = {1/n_kappas,...}
      }

      kappa_hat <- kappas[k]
      IO_hat    <- I0s[k]

      # Priors 

      mu ~ dunif(0,pi2)

      # other values

      C   <- 1000000   # for the zero's trick
      pi  <- 3.14159
      pi2 <- 2*pi
  }
"

Let’s give BUGS all values of \(\kappa\) from 1 to 40:

n_kappas <- 40
kappas <- 1:n_kappas
I0_s   <- sapply(kappas, I0)

# Everything is ready. Run the model!
run.model(modelString, samples=c("mu","kappa_hat"), 
          data=list(x        = my_data,
                    n        = length(my_data),
                    kappas   = kappas, 
                    I0s      = I0_s, 
                    n_kappas = n_kappas), 
          chainLength=5e3)
## model is syntactically correct
## data loaded
## model compiled
## initial values generated, model initialized
## 500 updates took 0 s
## monitor set for variable 'mu'
## monitor set for variable 'kappa_hat'
## 5000 updates took 2 s
mu_hat    <- samplesStats(c("mu"))[[1]]  # point estimate of distribution's mean
kappa_hat <- samplesStats(c("kappa_hat"))[[1]]  # point estimate of distribution's concentration

# plot results
plot_circular_compare(mu_real, kappa_real, mu_hat, kappa_hat)
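
As a cross-check outside BUGS (a sketch, fixing \(\mu\) at the earlier MLE), we can profile the log-likelihood over the same \(\kappa\) grid and confirm it peaks near the sampled kappa_hat:

# profile the von Mises log-likelihood over the kappa grid, mu fixed at its MLE
grid_loglik <- sapply(1:n_kappas, function(k)
  sum(k*cos(my_data - estimates$mu[[1]]) - log(2*pi*I0(k))))
which.max(grid_loglik)   # should land near kappa_hat (the true kappa is 15)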

Since we are doing Bayesian inference, we can also plot the uncertainty of these posterior values. First, get the samples from the MCMC run:

mus    <- samplesSample("mu")
kappas <- samplesSample("kappa_hat")

Then plot a bunch of them:

ff.inferred <- function(x) dvonmises(x, mu=circular(mu_hat), kappa=kappa_hat)
curve.circular(ff.inferred, col="red", shrink=2.25, lwd=2) 
  
for (i in 1:250) {                                            # draw a number of samples
  ff.sample <- function(x) dvonmises(x, mu=circular(mus[i]), kappa=kappas[i])
  curve.circular(ff.sample, col="grey", lwd=2, add=T)
}

curve.circular(ff.inferred, col="red", lwd=2, add=T) # draw mean estimation

BUGS model IV: mixture of von Mises with known concentrations

The Bayesian approach can deal with mixtures in the same framework. Let's see a possible solution for this type of problem. We will present an example with two distributions, but it can be generalized to more complex mixtures. This solution assumes we know both concentration parameters \(\kappa_1, \kappa_2\).

Let's create some random mixture data:

set.seed(101)

mu1_real    <- pi/4  # not known
kappa1_real <- 10    # known

mu2_real    <- pi    # not known
kappa2_real <- 5     # known

my_data1 <- as.numeric(rvonmises(100, circular(mu1_real), kappa1_real))
my_data2 <- as.numeric(rvonmises(100, circular(mu2_real), kappa2_real))

my_mixdata <- c(my_data1, my_data2)

par(mfrow=c(1,2))
rose.diag(my_mixdata, bins=50, prop=2.5, col="lightgreen")
points(as.circular(my_data1), pch=20, col="darkgreen")
points(as.circular(my_data2), pch=20, col="lightgreen")

ff.mix1 <- function(x) dvonmises(x, mu=circular(mu1_real), kappa=kappa1_real)
ff.mix2 <- function(x) dvonmises(x, mu=circular(mu2_real), kappa=kappa2_real)
curve.circular(ff.mix1, xlim=c(-1.5,1.5), col="darkgreen", shrink=1.1, lwd=2)
curve.circular(ff.mix2, col="lightgreen", lwd=2, add=T)

par(mfrow=c(1,1))

This model is based on the BUGS code for a mixture of Gaussians:

modelString = "
  model {

      for (i in 1:n) {

        phi[i] <- - kappa[i] * cos(x[i]-mu[i]) + log(2*pi*I0[i]) + C

        mu[i]    <- lambdaMu[G[i]]
        kappa[i] <- lambdaKappa[G[i]]
        I0[i]    <- lambdaI0[G[i]]
        
        G[i]   ~  dcat(P[])     # the cluster attributions for each x[i]

        dummy[i] <- 0
        dummy[i] ~ dpois( phi[i] )
      }

      P[1:2] ~ ddirch(alpha[])  # dirichlet distribution (here, just 2 clusters)
      alpha[1] <- 0.5           
      alpha[2] <- 0.5              

      lambdaKappa[1] <- 10
      lambdaKappa[2] <- 5

      lambdaI0[1] <- 2815.717   # I_0(10)
      lambdaI0[2] <- 27.23987   # I_0(5)

      lambdaMu[1] ~ dunif(0,pi2)
      lambdaMu[2] ~ dunif(0,pi2)

      # other values

      C   <- 1000000   # for the zero's trick
      pi  <- 3.14159
      pi2 <- 2*pi
  }
"

The vector G assigns each datapoint to a cluster. If a datapoint \(x\) is assigned to cluster \(i\), then its likelihood is given by \(f(x|\mu_i, \kappa_i)\). Each G[i] follows a categorical distribution with weights \(P\), and \(P\) is given a Dirichlet prior with parameter \(\alpha\) (a vector of positive values); \(P\) itself is a simplex (ie, a vector of non-negative values that sum to one).
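
For intuition, the same generative structure can be sketched in base R (the Dirichlet draw uses the standard normalized-gamma construction, since base R has no Dirichlet sampler):

# sketch: Dirichlet weights via normalized gammas, then categorical labels
set.seed(1)
alpha <- c(0.5, 0.5)
g <- rgamma(length(alpha), shape=alpha)
P <- g / sum(g)                                   # a simplex: P >= 0, sum(P) == 1
G <- sample(1:2, size=10, replace=TRUE, prob=P)   # cluster labels
G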

Let’s run the model and extract the posterior means:

run.model(modelString, samples=c("lambdaMu","G"), 
          data=list(x = my_mixdata,
                    n = length(my_mixdata),
                    G = c(1, rep(NA,length(my_mixdata)-1))), # fix one label to anchor the clusters (avoids label switching)
          chainLength=1e4)
## model is syntactically correct
## data loaded
## model compiled
## initial values generated, model initialized
## 1000 updates took 0 s
## monitor set for variable 'lambdaMu'
## monitor set for variable 'G'
## 10000 updates took 7 s
mu_clusters <- samplesStats("lambdaMu")$mean
mu_clusters <- mu_clusters %% (2*pi)  # place values inside the [0, 2*pi) interval

The next code compares the true mix with the estimated mix:

ff.real1 <- function(x) dvonmises(x, mu=circular(mu1_real),  kappa=kappa1_real)
ff.real2 <- function(x) dvonmises(x, mu=circular(mu2_real),  kappa=kappa2_real)
ff.real  <- function(x) ifelse(ff.real1(x)>ff.real2(x),ff.real1(x),ff.real2(x))
curve.circular(ff.real, col="blue", shrink=2.25, lwd=2) 

ff.inferred1 <- function(x) dvonmises(x, mu=circular(mu_clusters[1]),  kappa=kappa1_real)
curve.circular(ff.inferred1, col="red", lwd=2, add=T) 
ff.inferred2 <- function(x) dvonmises(x, mu=circular(mu_clusters[2]),  kappa=kappa2_real)
ff.inferred  <- function(x) ifelse(ff.inferred1(x)>ff.inferred2(x),ff.inferred1(x),ff.inferred2(x))
curve.circular(ff.inferred, col="red", lwd=2, add=T) 

legend("topleft",c("real","estimate"), col=c("blue","red"), lwd=2, bty = "n")

The MCMC also gives an attribution of each datapoint to a cluster, so we can check how precise this clustering was. In the next code snippet, we see that each point has a value that shows how much nearer it is to cluster \(1\) or \(2\). This value can be interpreted (in a logistic regression fashion) as a probability of belonging to either cluster. Herein, we will just apply a threshold at the middle value, \(1.5\), to classify each datapoint.

We denote the two real distributions as ‘a’ and ‘b’, and compare them with the estimated cluster id of each sample:

cluster_id <- samplesStats("G")[,1]
cluster_id
##   [1] 1.000 1.003 1.002 1.002 1.000 1.000 1.000 1.000 1.000 1.000 1.529
##  [12] 1.000 1.000 1.000 1.000 1.000 1.002 1.000 1.000 1.000 1.000 1.000
##  [23] 1.000 1.000 1.000 1.000 1.002 1.000 1.002 1.001 1.000 1.000 1.002
##  [34] 1.001 1.001 1.000 1.000 1.000 1.002 1.000 1.002 1.000 1.000 1.000
##  [45] 1.000 1.000 1.000 1.001 1.000 1.001 1.001 1.000 1.000 1.000 1.000
##  [56] 1.000 1.001 1.000 1.000 1.000 1.000 1.000 1.002 1.001 1.002 1.000
##  [67] 1.000 1.000 1.059 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.002
##  [78] 1.000 1.000 1.001 1.001 1.000 1.000 1.000 1.000 1.000 1.001 1.001
##  [89] 1.002 1.000 1.000 1.001 1.002 1.000 1.002 1.000 1.000 1.015 1.000
## [100] 2.000 1.999 2.000 2.000 2.000 2.000 2.000 2.000 2.000 2.000 2.000
## [111] 2.000 2.000 2.000 2.000 2.000 2.000 2.000 2.000 2.000 2.000 2.000
## [122] 2.000 2.000 2.000 2.000 2.000 2.000 2.000 2.000 2.000 2.000 2.000
## [133] 2.000 2.000 2.000 2.000 2.000 2.000 2.000 2.000 2.000 2.000 2.000
## [144] 2.000 2.000 2.000 2.000 2.000 2.000 2.000 2.000 2.000 2.000 1.984
## [155] 2.000 2.000 2.000 2.000 2.000 2.000 2.000 2.000 2.000 2.000 2.000
## [166] 2.000 2.000 2.000 2.000 2.000 2.000 2.000 2.000 2.000 2.000 2.000
## [177] 2.000 2.000 2.000 2.000 2.000 2.000 2.000 1.976 2.000 2.000 2.000
## [188] 2.000 2.000 2.000 2.000 2.000 1.998 2.000 2.000 2.000 2.000 2.000
## [199] 2.000
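
Equivalently, since each G[i] takes values in {1,2}, its posterior mean is \(1 + P(G_i = 2)\); subtracting one gives the per-point probability of cluster 2, and the \(1.5\) threshold below is just a \(0.5\) probability cut (a sketch):

# posterior mean of G is 1 + P(G==2), so:
p_cluster2 <- cluster_id - 1   # per-point probability of belonging to cluster 2
head(round(p_cluster2, 3))
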
estimated_id <- 0+(cluster_id > 1.5) # values are between 1 and 2, we split at the middle

real_id <- c(rep('a',100), rep('b',100))[-1] # remove the first value (was fixed in BUGS)
table(estimated_id, real_id)
##             real_id
## estimated_id   a   b
##            0  98   0
##            1   1 100

The classification error(s) were due to 'outliers' of the second distribution, which has a smaller \(\kappa\) parameter, so its values are more spread over the unit circle.

Let's visualize where the error(s) are (the colored square gives the real cluster id, and the colored cross the estimated id):

mixdata_plot <- my_mixdata[-1]
rose.diag(my_mixdata, bins=50, prop=2.5, col="lightgreen",
          control.circle=circle.control(lty=0))
points(as.circular(mixdata_plot[real_id=='a']), col="orange", pch=0, lwd=1.5)
points(as.circular(mixdata_plot[real_id=='b']), col="purple", pch=0, lwd=1.5)

points(as.circular(mixdata_plot[estimated_id==0]), col="orange", pch=3, lwd=1.5)
points(as.circular(mixdata_plot[estimated_id==1]), col="purple", pch=3, lwd=1.5)

We see that the error lies at one of the 'frontiers' between the two von Mises distributions.

The Wrapped Cauchy Distribution

The Wrapped Cauchy Distribution is the circular analogue of the Cauchy distribution, which is quite useful for heavy-tail modeling. It has two parameters: a centrality, \(\mu\), and a concentration parameter, \(\rho\) (the higher it is, the more concentrated the probability mass is over \(\mu\)).

Its pdf is given by

\[f(\theta | \mu, \rho) = \frac{1}{2 \pi} \frac{1-\rho^2}{1+\rho^2-2\rho\cos(\theta-\mu)}, \quad -\pi \le \theta < \pi\]

where \(\mu \in [-\pi,\pi)\) and \(\rho \in [0,1)\).

The next code samples 100 data points from \(f(\cdot | 0, 0.7)\):

mu  <- 0
rho <- 0.7

set.seed(101)
rdata <- rwrappedcauchy(100, mu=circular(mu), rho=rho)
plot(rdata, xlim=c(-1,2), col="darkgreen", pch=20)
ff <- function(x) dwrappedcauchy(x, mu=circular(mu), rho=rho)
curve.circular(ff, col="blue", lwd=2, add=T)

Notice how the number of 'outliers' increases with this distribution.

The function pdf_cauchy computes the density just like the circular package's dwrappedcauchy:

# pre: -pi <= theta < pi
# location mu, -pi <= mu < pi
# concentration rho, rho in [0,1)
pdf_cauchy <- function(theta, mu, rho) {
  1/(2*pi) * (1-rho^2) / (1+rho^2-2*rho*cos(theta-mu))
}

mu  <- as.circular(pi/4)
rho <- 0.3
xs  <- as.circular(seq(0,pi/2,len=7))

dwrappedcauchy(xs, mu, rho) 
## [1] 0.2175502 0.2539181 0.2837351 0.2955735 0.2837351 0.2539181 0.2175502
sapply(xs,function (x) pdf_cauchy(x,mu,rho))
## [1] 0.2175502 0.2539181 0.2837351 0.2955735 0.2837351 0.2539181 0.2175502

Now let’s see how to model this function in BUGS.

BUGS model I: estimate mean and concentration

This pdf is easier to represent in BUGS; we can directly estimate both parameters:

modelString = "
  model {

      for (i in 1:n) {

        phi[i] <- -log( (1-pow(rho,2)) / (2*pi*(1+pow(rho,2)-2*rho*cos(x[i]-mu))) ) + C

        dummy[i] <- 0
        dummy[i] ~ dpois( phi[i] )
      }

      # Priors 

      mu  ~ dunif(pi_1,pi)
      rho ~ dunif(0,1)

      # other values

      C    <- 1000000   # for the zero's trick
      pi   <- 3.14159
      pi_1 <- -pi
  }
"

Notice the weak priors over \(\mu\) and \(\rho\). As usual, if there is prior information that helps concentrate the prior mass, the results can be improved.

Now, let’s create a random dataset:

mu  <- as.circular(pi/4)
rho <- 0.7

set.seed(121)
n <- 100
my_data <- rwrappedcauchy(n, mu, rho)

rose.diag(my_data, bins=50, prop=1.5, shrink=1)
points(my_data, pch=20, col="darkgreen")

And apply the model to the dataset. Herein, we'll use the MLE estimates to initialize the parameters and help BUGS (they are easy to compute with function mle.wrappedcauchy, but we could initialize with any reasonable value). We will also use these MLE values to compare with the MCMC results.

# pre: x >= -pi (the loop only reduces angles above pi)
# returns the equivalent angle in [-pi, pi]
cauchy_process_val <- function(x) {
  while(x>pi)
    x = x-2*pi
  x
}
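
Since the loop only ever subtracts, it assumes its input is at least \(-\pi\); an equivalent loop-free version (a sketch) wraps any angle with modular arithmetic:

# loop-free equivalent: wrap any angle into [-pi, pi)
wrap_to_pi <- function(x) ((x + pi) %% (2*pi)) - pi
wrap_to_pi(c(0, 3*pi/2, -pi))   # -> 0, -pi/2, -pi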

# the pdf requires values between -pi <= mu < pi
pp_my_data <- sapply(as.numeric(my_data), cauchy_process_val)

# MLE estimates
mle <- mle.wrappedcauchy(my_data)
mu_mle  <- as.numeric(mle$mu)
rho_mle <- as.numeric(mle$rho)

# initializations
genInitFactory <- function()  {
  i <- 0
  function() {
    i <<- i + 1
    list( 
      mu  = mu_mle,  # MLE estimates
      rho = rho_mle
    ) 
  }
}

# Everything is ready. Run the model!
run.model(modelString, samples=c("mu", "rho"), 
          data=list(x     = pp_my_data,
                    n     = length(pp_my_data)), 
          chainLength=2e4, init.func=genInitFactory())
## model is syntactically correct
## data loaded
## model compiled
## Initializing chain 1:
## model is initialized
## 2000 updates took 0 s
## monitor set for variable 'mu'
## monitor set for variable 'rho'
## 20000 updates took 3 s
samplesStats(c("mu"))
##      mean      sd  MC_error val2.5pc median val97.5pc start sample
## mu 0.7324 0.05387 0.0009973   0.6238 0.7334    0.8348  2001  20000
samplesStats(c("rho"))
##       mean      sd  MC_error val2.5pc median val97.5pc start sample
## rho 0.6986 0.03523 0.0006057   0.6238 0.7003    0.7624  2001  20000
# keep point estimates
mu_hat  <- samplesStats(c("mu"))[[1]]   # point estimate of the distribution's mean
rho_hat <- samplesStats(c("rho"))[[1]]  # point estimate of the distribution's concentration

We can now plot the results:

# plot results
plot(my_data, shrink=1.5, col="darkgreen", pch=20)

ff.real <- function(x) dwrappedcauchy(x, mu=circular(mu), rho=rho)
curve.circular(ff.real, col="blue", lwd=2, add=T)

ff.mle <- function(x) dwrappedcauchy(x, mu=circular(mu_mle), rho=rho_mle)
curve.circular(ff.mle, col="orange", lwd=2, add=T)

ff.inferred <- function(x) dwrappedcauchy(x, mu=circular(mu_hat), rho=rho_hat)
curve.circular(ff.inferred, col="red", lwd=2, add=T)
legend("topleft",c("real", "MLE", "MCMC"), col=c("blue","orange", "red"), lwd=2, bty = "n")

Again, we could also check the uncertainty over the estimates:

mus  <- samplesSample("mu")
rhos <- samplesSample("rho")

ff.inferred <- function(x) dwrappedcauchy(x, mu=circular(mu_hat), rho=rho_hat)
curve.circular(ff.inferred, col="red", shrink=1.5, lwd=2) 
  
for (i in 1:500) {                                         # draw a number of samples
  ff.sample <- function(x) dwrappedcauchy(x, mu=circular(mus[i]), rho=rhos[i])
  curve.circular(ff.sample, col="grey", lwd=2, add=T)
}

curve.circular(ff.inferred, col="red", lwd=2, add=T) # draw mean estimation

BUGS model II: mixture of wrapped Cauchys

Make a mixture:

set.seed(101)

mu1_real  <- pi/4   # not known
rho1_real <- 0.65   # not known

mu2_real  <- -pi/2  # not known
rho2_real <- 0.75   # not known

my_data1 <- as.numeric(rwrappedcauchy(100, circular(mu1_real), rho1_real))
my_data2 <- as.numeric(rwrappedcauchy(100, circular(mu2_real), rho2_real))

my_mixdata <- c(my_data1, my_data2)

par(mfrow=c(1,2))
rose.diag(my_mixdata, bins=50, prop=2.5, col="lightgreen")
points(as.circular(my_data1), pch=20, col="darkgreen")
points(as.circular(my_data2), pch=20, col="lightgreen")

ff.mix1 <- function(x) dwrappedcauchy(x, mu=circular(mu1_real), rho=rho1_real)
ff.mix2 <- function(x) dwrappedcauchy(x, mu=circular(mu2_real), rho=rho2_real)
curve.circular(ff.mix1, xlim=c(-1.9,1.1), col="darkgreen", shrink=1.1, lwd=2)
curve.circular(ff.mix2, col="lightgreen", lwd=2, add=T)

The model is very similar to the previous mixture for two von Mises distributions. We just added the concentration parameter \(\rho\) as another parameter for the MCMC to estimate:

modelString = "
  model {

      # Log-Likelihood function 

      for (i in 1:n) {

        phi[i] <- -log( (1-pow(rho[i],2)) /
                        (2*pi*(1+pow(rho[i],2)-2*rho[i]*cos(x[i]-mu[i]))) ) + C

        mu[i]  <- lambdaMu[G[i]]
        rho[i] <- lambdaRho[G[i]]

        G[i]   ~  dcat(P[])     # the cluster attributions for each x[i]

        dummy[i] <- 0
        dummy[i] ~ dpois( phi[i] )
      }

      P[1:2] ~ ddirch(alpha[])  # dirichlet distribution (here, just 2 clusters)
      alpha[1] <- 0.5           
      alpha[2] <- 0.5              

      lambdaMu[1] ~ dunif(pi_1,pi)
      lambdaMu[2] ~ dunif(pi_1,pi)

      lambdaRho[1] ~ dunif(0,1)
      lambdaRho[2] ~ dunif(0,1)

      # other values

      C   <- 1000000   # for the zero's trick
      pi  <- 3.14159
      pi_1 <- -pi
  }
"

Now let’s run the model:

pp_my_mixdata <- sapply(as.numeric(my_mixdata), cauchy_process_val)

run.model(modelString, samples=c("lambdaMu","lambdaRho","G"), 
          data=list(x = pp_my_mixdata,  # use the wrapped data computed above
                    n = length(pp_my_mixdata),
                    G = c(1, rep(NA,length(pp_my_mixdata)-1))),
          chainLength=1e4)
## model is syntactically correct
## data loaded
## model compiled
## initial values generated, model initialized
## 1000 updates took 0 s
## monitor set for variable 'lambdaMu'
## monitor set for variable 'lambdaRho'
## monitor set for variable 'G'
## 10000 updates took 10 s
mu_clusters  <- samplesStats("lambdaMu")$mean
rho_clusters <- samplesStats("lambdaRho")$mean

Let’s compare the true and estimated mixtures:

ff.real1 <- function(x) dwrappedcauchy(x, mu=circular(mu1_real),  rho=rho1_real)
ff.real2 <- function(x) dwrappedcauchy(x, mu=circular(mu2_real),  rho=rho2_real)
ff.real  <- function(x) ifelse(ff.real1(x)>ff.real2(x),ff.real1(x),ff.real2(x))
curve.circular(ff.real, col="blue", shrink=2.25, lwd=2) 

ff.inferred1 <- function(x) dwrappedcauchy(x, mu=circular(mu_clusters[1]),  rho=rho_clusters[1])
ff.inferred2 <- function(x) dwrappedcauchy(x, mu=circular(mu_clusters[2]),  rho=rho_clusters[2])
ff.inferred  <- function(x) ifelse(ff.inferred1(x)>ff.inferred2(x),ff.inferred1(x),ff.inferred2(x))
curve.circular(ff.inferred, col="red", lwd=2, add=T) 

legend("topleft",c("real","estimate"), col=c("blue","red"), lwd=2, bty = "n")

And, as we did in the von Mises case, let's check the cluster attribution. But first, notice that the classification error must be higher here, since many more datapoints from one distribution fall in the density peak area of the other.
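
One rough way to quantify that overlap (a sketch) is the shared mass under the two true densities; the larger it is, the more misclassification becomes unavoidable:

# shared mass between the two true densities: integral of min(f1,f2)
xs <- seq(-pi, pi, len=1000)
f1 <- as.numeric(dwrappedcauchy(circular(xs), circular(mu1_real), rho1_real))
f2 <- as.numeric(dwrappedcauchy(circular(xs), circular(mu2_real), rho2_real))
sum(pmin(f1, f2)) * (2*pi/1000)   # fraction of probability mass in dispute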

cluster_id <- samplesStats("G")[,1]
cluster_id
##   [1] 1.025 1.075 1.130 1.050 1.057 1.546 1.082 1.238 1.762 1.034 1.072
##  [12] 1.059 1.026 1.632 1.483 1.038 1.038 1.195 1.026 1.076 1.025 1.037
##  [23] 1.127 1.034 1.040 1.027 1.145 1.191 1.130 1.249 1.065 1.040 1.036
##  [34] 1.055 1.033 1.041 1.040 1.026 1.027 1.331 1.391 1.044 1.081 1.145
##  [45] 1.041 1.030 1.029 1.043 1.026 1.026 1.322 1.089 1.064 1.575 1.044
##  [56] 1.097 1.226 1.083 1.497 1.041 1.038 1.303 1.026 1.036 1.068 1.029
##  [67] 1.067 1.030 1.028 1.041 1.028 1.860 1.032 1.034 1.050 1.025 1.046
##  [78] 1.112 1.030 1.091 1.041 1.039 1.027 1.026 1.030 1.078 1.283 1.034
##  [89] 1.025 1.135 1.031 1.071 1.033 1.051 1.040 1.132 1.043 1.029 1.028
## [100] 1.947 1.962 1.526 1.942 1.887 1.744 1.951 1.956 1.918 1.959 1.932
## [111] 1.954 1.472 1.950 1.945 1.952 1.948 1.174 1.739 1.706 1.498 1.943
## [122] 1.951 1.347 1.870 1.937 1.281 1.148 1.809 1.946 1.793 1.637 1.888
## [133] 1.817 1.421 1.392 1.237 1.954 1.077 1.834 1.953 1.855 1.960 1.793
## [144] 1.949 1.922 1.069 1.790 1.958 1.181 1.938 1.030 1.717 1.953 1.946
## [155] 1.898 1.375 1.917 1.346 1.944 1.956 1.865 1.948 1.425 1.816 1.953
## [166] 1.836 1.953 1.955 1.652 1.830 1.959 1.205 1.903 1.950 1.959 1.917
## [177] 1.938 1.954 1.835 1.951 1.915 1.807 1.879 1.684 1.958 1.893 1.949
## [188] 1.690 1.915 1.907 1.963 1.766 1.864 1.949 1.338 1.854 1.948 1.914
## [199] 1.026
estimated_id <- 0+(cluster_id > 1.5) # values are between 1 and 2, we split at the middle

real_id <- c(rep('a',100), rep('b',100))[-1] # remove the first value (was fixed in BUGS)
table(estimated_id, real_id)
##             real_id
## estimated_id  a  b
##            0 94 19
##            1  5 81

Let's visualize where the errors are (again, the colored square gives the real cluster id, and the colored cross the estimated id):

mixdata_plot <- my_mixdata[-1]
rose.diag(my_mixdata, bins=50, prop=2.5, col="lightgreen",
          control.circle=circle.control(lty=0))
points(as.circular(mixdata_plot[real_id=='a']), col="orange", pch=0, lwd=1.5)
points(as.circular(mixdata_plot[real_id=='b']), col="purple", pch=0, lwd=1.5)

points(as.circular(mixdata_plot[estimated_id==0]), col="orange", pch=3, lwd=1.5)
points(as.circular(mixdata_plot[estimated_id==1]), col="purple", pch=3, lwd=1.5)

BUGS model III: mixture of von Mises and wrapped Cauchy

In this section we will make a model to infer the parameters of a mixed signal from two different sources, one a von Mises distribution, the other a wrapped Cauchy.

set.seed(121)

mu1_real  <- pi/8   # not known
kappa1_real <- 25   # known

mu2_real  <- -pi/8  # not known
rho2_real <- 0.9    # not known

my_data1 <- as.numeric(rvonmises(100,      circular(mu1_real), kappa1_real))
my_data2 <- as.numeric(rwrappedcauchy(100, circular(mu2_real), rho2_real))

my_mixdata <- c(my_data1, my_data2)

par(mfrow=c(1,2))
rose.diag(my_mixdata, bins=50, prop=2.5, col="lightgreen")
points(as.circular(my_data1), pch=20, col="darkgreen")
points(as.circular(my_data2), pch=20, col="lightgreen")

ff.mix1 <- function(x) dvonmises(x,      mu=circular(mu1_real), kappa=kappa1_real)
ff.mix2 <- function(x) dwrappedcauchy(x, mu=circular(mu2_real), rho=rho2_real)
curve.circular(ff.mix1, xlim=c(-0.9,2.1), col="darkgreen", shrink=1.5, lwd=2)
curve.circular(ff.mix2, col="lightgreen", lwd=2, add=T)

par(mfrow=c(1,1))

We assume G[i]==1 if datapoint \(x_i\) was produced by the von Mises, and G[i]==2 if produced by the wrapped Cauchy. The mixed likelihood is given as follows:

\[f(x_i|\mu_{VM},\kappa,\mu_{WC},\rho) = (G[i]==1)*f_{VM}(x_i|\mu_{VM},\kappa) + (G[i]==2)*f_{WC}(x_i|\mu_{WC},\rho)\]

where \(f_{VM}\) and \(f_{WC}\) are the respective pdf’s for the von Mises and wrapped Cauchy distributions.

We use BUGS' equals function to implement this mixed likelihood. The value of equals(x,y) is \(1\) when x and y are equal, and \(0\) otherwise.
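
In R terms, equals acts as a 0/1 indicator, so each point's density is a switch between the two component pdfs (a sketch of the same idea, using the true parameters above):

# indicator-based switch between the two component densities
mix_pdf <- function(x, g) {
  (g == 1) * as.numeric(dvonmises(circular(x), circular(mu1_real), kappa1_real)) +
  (g == 2) * as.numeric(dwrappedcauchy(circular(x), circular(mu2_real), rho2_real))
}
mix_pdf(0.3, 1)   # density under the von Mises component
mix_pdf(0.3, 2)   # density under the wrapped Cauchy component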

modelString = "
  model {

      for (i in 1:n) {

        phi[i] <- equals(G[i],1) *                  # von Mises pdf
                    (
                      - kappa[i] * cos(x[i]-mu[i]) + log(2*pi*I0[i])
                    )
                  +
                  equals(G[i],2) *                  # wrapped Cauchy
                    ( -log( (1-pow(rho[i],2)) /
                            (2*pi*(1+pow(rho[i],2)-2*rho[i]*cos(x[i]-mu[i]))) )
                    )
                  + C

        mu[i]    <- lambdaMu[G[i]]
        kappa[i] <- lambdaKappa[G[i]]
        I0[i]    <- lambdaI0[G[i]]
        rho[i]   <- lambdaRho[G[i]]
        
        G[i] ~ dcat(P[])           # the cluster attributions for each x[i]

        dummy[i] <- 0
        dummy[i] ~ dpois( phi[i] )
      }

      P[1:2] ~ ddirch(alpha[])     # dirichlet distribution (here, just 2 clusters)
      alpha[1] <- 0.5           
      alpha[2] <- 0.5              

      lambdaMu[1] ~ dunif(0,pi2)
      lambdaMu[2] ~ dunif(pi_1,pi)

      lambdaKappa[1] <- 25         # known concentration for the von Mises
      lambdaKappa[2] <- 100        # the second cluster does not have a kappa
                                   # but a value must be given, so let's choose a high
                                   # concentration to make it harder to be selected

      lambdaI0[1] <- 5774560606.0  # I_0(25)
      lambdaI0[2] <- 1.0E+42       # I_0(100)

      lambdaRho[1] <- 0.999999     # the first cluster does not have a rho parameter
                                   # but a value must be given, so let's choose a high
                                   # concentration to make it harder to be selected
      lambdaRho[2] ~ dunif(0,1)

      # other values

      C    <- 1000000   # for the zero's trick
      pi   <- 3.14159
      pi2  <- 2*pi  
      pi_1 <- -pi
  }
"
# this rotation is irrelevant for the von Mises inference
pp_my_mixdata <- sapply(as.numeric(my_mixdata), cauchy_process_val)

run.model(modelString, samples=c("lambdaMu","lambdaRho","G"), 
          data=list(x = pp_my_mixdata,
                    n = length(pp_my_mixdata),
                    G = c(1, rep(NA,length(pp_my_mixdata)-1))),
          chainLength=1e4)
## model is syntactically correct
## data loaded
## model compiled
## initial values generated, model initialized
## 1000 updates took 1 s
## monitor set for variable 'lambdaMu'
## monitor set for variable 'lambdaRho'
## monitor set for variable 'G'
## 10000 updates took 16 s
mu_clusters <- samplesStats("lambdaMu")$mean
mu_clusters <- sapply(mu_clusters, cauchy_process_val) # place the mu's between [-pi,pi]

vonMises_mu <- mu_clusters[1]                          # get parameter's point estimates
wrapped_mu  <- mu_clusters[2]
wrapped_rho <- samplesStats("lambdaRho")$mean

Let’s compare the real signal with the estimated one:

ff.real1 <- function(x) dvonmises     (x, mu=circular(mu1_real),  kappa=kappa1_real)
ff.real2 <- function(x) dwrappedcauchy(x, mu=circular(mu2_real),  rho=rho2_real)
ff.real  <- function(x) ifelse(ff.real1(x)>ff.real2(x),ff.real1(x),ff.real2(x))
curve.circular(ff.real, col="blue", shrink=2.5, lwd=2) 

ff.inferred1 <- function(x) dvonmises     (x, mu=circular(vonMises_mu),  kappa=kappa1_real)
ff.inferred2 <- function(x) dwrappedcauchy(x, mu=circular(wrapped_mu), rho=wrapped_rho)
ff.inferred  <- function(x) ifelse(ff.inferred1(x)>ff.inferred2(x),ff.inferred1(x),ff.inferred2(x))
curve.circular(ff.inferred, col="red", lwd=2, add=T) 

legend("topleft",c("real","estimate"), col=c("blue","red"), lwd=2, bty = "n")

And let’s classify the datapoints:

cluster_id <- samplesStats("G")[,1]
cluster_id
##   [1] 1.027 1.022 1.027 1.025 1.027 1.047 1.034 1.028 1.028 1.059 1.110
##  [12] 1.023 1.879 1.936 1.287 1.028 1.029 1.025 1.041 1.021 1.032 1.028
##  [23] 1.038 1.050 1.026 1.022 1.986 1.037 1.036 1.035 1.027 1.079 1.030
##  [34] 1.058 1.034 1.103 1.027 1.077 1.024 1.023 1.037 1.378 1.028 1.026
##  [45] 1.030 1.032 1.037 1.057 1.036 1.050 1.029 1.032 1.119 1.422 1.055
##  [56] 1.031 1.027 1.021 1.145 1.069 1.023 1.170 1.044 1.074 1.024 1.022
##  [67] 1.023 1.030 1.108 1.060 1.023 1.084 1.028 1.119 1.040 1.028 1.034
##  [78] 1.039 1.148 1.029 1.313 1.045 1.024 1.035 1.039 1.030 1.023 1.019
##  [89] 1.032 1.040 1.056 1.028 1.072 1.030 1.049 1.054 1.027 1.024 1.087
## [100] 1.999 2.000 2.000 2.000 2.000 1.041 1.994 1.028 1.995 1.997 1.034
## [111] 1.262 1.997 2.000 1.999 1.985 2.000 1.030 2.000 2.000 1.950 1.999
## [122] 1.995 1.999 2.000 1.990 2.000 2.000 2.000 2.000 2.000 2.000 2.000
## [133] 2.000 2.000 1.069 2.000 2.000 2.000 2.000 1.998 2.000 1.021 2.000
## [144] 1.998 2.000 1.996 1.969 2.000 2.000 2.000 1.998 2.000 2.000 2.000
## [155] 2.000 1.946 2.000 1.999 1.978 1.998 2.000 1.956 1.946 2.000 2.000
## [166] 1.990 2.000 1.128 1.986 2.000 2.000 2.000 2.000 1.992 2.000 2.000
## [177] 1.998 1.998 1.950 2.000 1.207 2.000 2.000 1.997 1.465 2.000 2.000
## [188] 2.000 1.967 2.000 1.999 2.000 1.751 1.997 2.000 2.000 1.999 2.000
## [199] 2.000

estimated_id <- 0+(cluster_id > 1.5) # values are between 1 and 2, we split at the middle

real_id <- c(rep('a',100), rep('b',100))[-1] # remove the first value (was fixed in BUGS)
table(estimated_id, real_id)
##             real_id
## estimated_id  a  b
##            0 96 10
##            1  3 90

Again, the colored square gives the real cluster id, and the colored cross the estimated id:

mixdata_plot <- my_mixdata[-1]
rose.diag(my_mixdata, bins=50, prop=2.5, col="lightgreen",
          control.circle=circle.control(lty=0))
points(as.circular(mixdata_plot[real_id=='a']), col="orange", pch=0, lwd=1.5)
points(as.circular(mixdata_plot[real_id=='b']), col="purple", pch=0, lwd=1.5)

points(as.circular(mixdata_plot[estimated_id==0]), col="orange", pch=3, lwd=1.5)
points(as.circular(mixdata_plot[estimated_id==1]), col="purple", pch=3, lwd=1.5)

Jones and Pewsey distribution


This distribution has several known circular distributions as special cases.

Its density is proportional to

\[f(x|\mu,\kappa,\psi) \propto (1+\tanh(\kappa \psi) \cos(x-\mu))^{1/\psi}\]

with \(\mu-\pi < x \leq \mu+\pi\), \(\kappa \geq 0\), and \(-\infty < \psi < \infty\).

However, the normalizing constant is more complex (details in their paper, Jones and Pewsey, A Family of Symmetric Distributions on the Circle). The next function computes it:

# compute the normalization constant of the Jones and Pewsey distribution
# ref:   Circular Statistics in R (Pewsey, Neuhäuser & Ruxton)
# check: https://circstatinr.st-andrews.ac.uk/resources/Chap4-RCommands.txt
JPNCon <- function(kappa, psi){
  if (kappa < 0.001)
    return(1/(2*pi))

  eps <- 10 * .Machine$double.eps # near the smallest positive x such that 1+x != 1
  if (abs(psi) <= eps)            # psi -> 0 is the von Mises case
    return(1/(2*pi*I.0(kappa)))

  intgrnd <- function(x){ (cosh(kappa*psi)+sinh(kappa*psi)*cos(x))**(1/psi) }

  tryCatch(
    1/integrate(intgrnd, lower=-pi, upper=pi)$value,
    error = function(c) {
      # some pairs gave an integration error at zero (?), so split the
      # integration in two -- however, this does not solve all the problems
      sum1 <- integrate(intgrnd, lower=-pi, upper=0)$value
      sum2 <- integrate(intgrnd, lower=0, upper=pi)$value
      1/(sum1+sum2)
    })
}

JPNCon(kappa=2,psi=0)
## [1] 0.0698175
JPNCon(kappa=5,psi=-5) 
## [1] 16.29323

The circular package implements this distribution’s density.

Let’s see some special cases:

ff <- function(x) djonespewsey(x, mu=0, kappa=0, psi=0)
curve.circular(ff, col="blue", lwd=2, shrink = 2, 
               main=expression(paste(kappa , " = 0 is the Uniform")))

ff <- function(x) djonespewsey(x, mu=0, kappa=5, psi=0.01)
curve.circular(ff, col="blue", lwd=2, shrink = 2, 
               main=expression(paste(psi," -> 0 approaches the von Mises")))

ff <- function(x) djonespewsey(x, mu=0, kappa=5, psi=1)
curve.circular(ff, col="blue", lwd=2, shrink = 2, 
               main=expression(paste(psi," = 1 is the cardioid distribution")))

ff <- function(x) djonespewsey(x, mu=0, kappa=2, psi=-1)
curve.circular(ff, col="blue", lwd=2, shrink = 2, 
               main=expression(paste(psi," = -1 is the wrapped Cauchy")))

ff <- function(x) djonespewsey(x, mu=0, kappa=10, psi=10)
curve.circular(ff, col="blue", lwd=2, shrink = 2, 
               main=expression(paste(psi," > 0 & ", kappa , ">>0 is Cartwright's Power-of-Cosine")))

ff <- function(x) djonespewsey(x, mu=0, kappa=10, psi=-3)
curve.circular(ff, col="blue", lwd=2, shrink = 2,
               main=expression(paste(psi," < -2 & ", kappa , ">>0 is a new distribution")))

The package circular does not provide a sampling function for it. The next code uses the acceptance-rejection method to sample from it: proposals are drawn uniformly on \([0, 2\pi)\) and accepted with probability equal to the density at the proposal divided by the density maximum (attained at \(\mu\)):

# ref:   Circular Statistics in R, 
# check: https://circstatinr.st-andrews.ac.uk/resources/Chap4-RCommands.txt
rjonespewsey <- function(n, mu, kappa, psi) {

  fmax  <- djonespewsey(mu, as.circular(mu), kappa, psi)
  theta <- numeric(n)   # pre-allocate the output vector
  
  for (j in 1:n) {
    stopgo <- FALSE
    while (!stopgo) {
      u1 <- runif(1, 0, 2*pi)
      pdfu1 <- djonespewsey(u1, as.circular(mu), kappa, psi)
      u2 <- runif(1, 0, fmax)
      if (u2 <= pdfu1) { 
        theta[j] <- u1 
        stopgo <- TRUE 
      }
    }
  }
  
  as.circular(theta)
}

# Eg of use:
set.seed(121)
mu <- pi/4; kappa <- 2; psi <- -3/4
rdata <- rjonespewsey(100, mu=mu, kappa=kappa, psi=psi)
rose.diag(rdata, bins=50, prop=1.5, shrink=1.5)
points(rdata, pch=20, col="darkgreen")
ff <- function(x) djonespewsey(x, mu=mu, kappa=kappa, psi=psi)
curve.circular(ff, col="blue", lwd=2, add=T)

Concerning circular::djonespewsey, I found some integration problems for specific values of (\(\mu, \kappa, \psi\)); for example, djonespewsey(0,0,4.63,-1.61) produces the error Error in integrate(ker, 0, 2 * pi) : the integral is probably divergent.

The package implementation is similar to this one:

djonespewsey2 <- function(x, mu, kappa, psi) {
  ker <- function(x) {
    (cosh(kappa * psi) + sinh(kappa * psi) * cos(x - mu))^(1/psi)/
      (2 * pi * cosh(kappa * psi))
  }
  
  normalization <- integrate(ker, 0, 2 * pi)$value
  ker(x)/normalization
}

djonespewsey(0,0,4.63,-1.61)
## Error in integrate(ker, 0, 2 * pi): the integral is probably divergent
djonespewsey2(0,0,4.63,-1.61)
## Error in integrate(ker, 0, 2 * pi): the integral is probably divergent

So, here’s a more stable density approximation without using integrate:

my_djonespewsey <- function(x, mu, kappa, psi) {
  ker <- function(x) {
    (cosh(kappa * psi) + sinh(kappa * psi) * cos(x - mu))^(1/psi)/
      (2 * pi * cosh(kappa * psi))
  }
  
  dx = 0.01
  xs <- seq(0,2*pi,dx)
  normalization <- sum(ker(xs))*dx
  ker(x)/normalization
}

my_djonespewsey(0,0,4.63,-1.61)
## [1] 57.11161
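
For parameter values where integrate does work, the Riemann-sum density should agree with JPNCon times the Jones and Pewsey kernel (a quick cross-check sketch):

# cross-check with the JPNCon normalization for a well-behaved (kappa, psi)
kap <- 2; ps <- -3/4
JPNCon(kap, ps) * (cosh(kap*ps) + sinh(kap*ps)*cos(0))^(1/ps)
my_djonespewsey(0, 0, kap, ps)   # should be close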

Just an example showing that the stable variant produces values similar to the package implementation:

mu <- -1; kappa <- 10; psi <- 1

xs <- seq(-pi,pi,len=100)
plot(xs, djonespewsey(xs,mu,kappa,psi), col="blue", lwd=7, type="l")
points(xs, my_djonespewsey(xs,mu,kappa,psi), col="orange", lwd=2, type="l")

Stan Model

Due to the difficult normalization constant, I didn't find a practical way to model this distribution in BUGS. The only possibility I see is to provide BUGS with a grid of values \((\kappa, \psi)\) computed by function JPNCon, and let BUGS choose the best pair…

Instead, I used Stan to implement the previous stable density, so we can infer the parameters of this distribution as we did before.

library(rstan)

This is the model in Stan:

model <- '
  functions {

     // debug purposes
     // void print_xs(real[] xs) {
     //   int size_xs;

     //   size_xs <- size(xs);
     //   for(i in 1:size_xs) {
     //      print(xs[i])
     //   }
     // }

     real kernel(real x, real mu, real kappa, real psi) {

        return (cosh(kappa * psi) + sinh(kappa * psi) * cos(x - mu))^(1/psi) /
               (2 * pi() * cosh(kappa * psi));
     }

     real normalization(real dx, real[] xs, real mu, real kappa, real psi) {

        real value;
        int size_xs;

        size_xs <- size(xs);
        value   <- 0;
        for (i in 1:size_xs) {
           value <- value + kernel(xs[i], mu, kappa, psi);
        }
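        // note: the grid step is actually 2*pi*dx, so this sum is off by a
        // constant factor of 2*pi; since that factor does not depend on
        // (mu, kappa, psi), it only shifts the log-likelihood by a constant
        // and does not affect the posterior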
        return value*dx;
     }

     real JonesPewsey_log(real[] D, int N, real dx, real[] xs, 
                          real mu, real kappa, real psi) {

        real log_lik;
        real log_pdf_const;

        log_pdf_const <- log( normalization(dx, xs, mu, kappa, psi) );
        log_lik <- 0;
        for(i in 1:N) {
          //log_lik <- log_lik + log( kernel(D[i], mu, kappa, psi)/pdf_const );
          log_lik <- log_lik + log( kernel(D[i], mu, kappa, psi) ) - log_pdf_const;
        }

        return log_lik;
     }
  }

  data {
    int<lower=0> N;   // number of observed data points
    real D[N];         // the observed data
  }

  transformed data {
 
    real<lower=0> dx; // should be a power of 10: 0.01, 0.001, ...
    real xs[100];     // should have size 1/dx

    dx <- 0.01;       // step to approximate integral with sums

    xs[1] <- 0;       // a grid of points between [0,2pi]
    for(i in 2:100)  {
      xs[i] <- xs[i-1] + 2*pi()*dx;
    }
    // print_xs(xs) // debug: to check if xs was correctly constructed
  }

  parameters {
    real<lower=-pi(), upper=pi()> mu;     // mu over the unit circle
    real<lower= 0,    upper=5>    kappa;  // kappa >= 0
    real<lower=-10,  upper=10>    psi;    // limited to [-10,10]
  }

  model {

    // priors
    mu    ~ uniform(-pi(),pi());   
    kappa ~ uniform( 0,5);
    psi ~ uniform(-10,10);                // behaves better than psi ~ cauchy(0,10);

    // likelihood
    D ~ JonesPewsey(N, dx, xs, mu, kappa, psi);   
  }
'

So, let's try to fit some circular data samples. The next function receives parameter values for the Jones and Pewsey distribution, generates a random sample from it, fits this sample (assuming the parameter values were unknown), and then plots the true distribution against the estimated one (ie, the distribution whose parameter values are the posterior means):

# place all angles inside 0 <= x < 2*pi
normalize_angles <- function(x) {
  while(x>2*pi)
    x = x-2*pi
  x
}

fit_and_plot <- function(mu, kappa, psi, N=75, iter=5000) {
  
  # step 1: create a random sample D
  set.seed(121)
  D <- as.numeric(rjonespewsey(N, mu=mu, kappa=kappa, psi=psi))
  D <- sapply(D, normalize_angles) # values inside 0 <= x < 2*pi
  
  # step 2: fit D using Stan's model
  fit  <- stan(model_code = model, data = list(D=D, N=N),  iter = 1000, chains = 1, verbose = FALSE)
  fit2 <- stan(fit = fit,          data = list(D=D, N=N),  iter = iter, chains = 2, 
             verbose = FALSE, seed=101, warmup=1000)
  print(fit2)
  
  # step 2a: extract posterior means
  la <- extract(fit2, permuted = TRUE) 
  mu_hat    <- mean(la$mu)
  kappa_hat <- mean(la$kappa)
  psi_hat   <- mean(la$psi)

  # step 3: plot
  # left panel
  par(mfrow=c(1,2))
  xs <- seq(-pi,pi,len=100)
  plot(xs, my_djonespewsey(xs,mu,kappa,psi), col="blue", lwd=4, type="l",
       xlab="x", ylab="density")
  points(xs, my_djonespewsey(xs,mu_hat,kappa_hat,psi_hat), col="red", lwd=2, type="l")
  # right panel
  plot(as.circular(D), shrink=1.25, col="darkgreen", pch=20)
  ff     <- function(x) my_djonespewsey(x, mu=mu, kappa=kappa, psi=psi)
  curve.circular(ff, col="blue", lwd=4, add=T)
  ff_hat <- function(x) my_djonespewsey(x, mu=mu_hat, kappa=kappa_hat, psi=psi_hat)
  curve.circular(ff_hat, col="red", lwd=2, add=T)
  legend("topleft",c("real","estimate"), col=c("blue","red"), lwd=2, bty = "n")
  par(mfrow=c(1,1))
  
  # return true parameters and point estimates
  data.frame(true=c(mu, kappa, psi), estimate=c(mu_hat, kappa_hat, psi_hat))
}

Fitting a distribution close to a von Mises:

fit_and_plot(mu=pi/4,  kappa=4, psi=1/8, iter=2500)
## 
## SAMPLING FOR MODEL '23368f84e08f2ce0726de8466b4ef4d3' NOW (CHAIN 1).
## 
## Chain 1, Iteration:   1 / 1000 [  0%]  (Warmup)
## Chain 1, Iteration: 100 / 1000 [ 10%]  (Warmup)
## Chain 1, Iteration: 200 / 1000 [ 20%]  (Warmup)
## Chain 1, Iteration: 300 / 1000 [ 30%]  (Warmup)
## Chain 1, Iteration: 400 / 1000 [ 40%]  (Warmup)
## Chain 1, Iteration: 500 / 1000 [ 50%]  (Warmup)
## Chain 1, Iteration: 501 / 1000 [ 50%]  (Sampling)
## Chain 1, Iteration: 600 / 1000 [ 60%]  (Sampling)
## Chain 1, Iteration: 700 / 1000 [ 70%]  (Sampling)
## Chain 1, Iteration: 800 / 1000 [ 80%]  (Sampling)
## Chain 1, Iteration: 900 / 1000 [ 90%]  (Sampling)
## Chain 1, Iteration: 1000 / 1000 [100%]  (Sampling)# 
## #  Elapsed Time: 0.733 seconds (Warm-up)
## #                0.392 seconds (Sampling)
## #                1.125 seconds (Total)
## # 
## 
## SAMPLING FOR MODEL '23368f84e08f2ce0726de8466b4ef4d3' NOW (CHAIN 1).
## 
## Chain 1, Iteration:    1 / 2500 [  0%]  (Warmup)
## Chain 1, Iteration:  250 / 2500 [ 10%]  (Warmup)
## Chain 1, Iteration:  500 / 2500 [ 20%]  (Warmup)
## Chain 1, Iteration:  750 / 2500 [ 30%]  (Warmup)
## Chain 1, Iteration: 1000 / 2500 [ 40%]  (Warmup)
## Chain 1, Iteration: 1001 / 2500 [ 40%]  (Sampling)
## Chain 1, Iteration: 1250 / 2500 [ 50%]  (Sampling)
## Chain 1, Iteration: 1500 / 2500 [ 60%]  (Sampling)
## Chain 1, Iteration: 1750 / 2500 [ 70%]  (Sampling)
## Chain 1, Iteration: 2000 / 2500 [ 80%]  (Sampling)
## Chain 1, Iteration: 2250 / 2500 [ 90%]  (Sampling)
## Chain 1, Iteration: 2500 / 2500 [100%]  (Sampling)# 
## #  Elapsed Time: 1.081 seconds (Warm-up)
## #                0.955 seconds (Sampling)
## #                2.036 seconds (Total)
## # 
## 
## SAMPLING FOR MODEL '23368f84e08f2ce0726de8466b4ef4d3' NOW (CHAIN 2).
## 
## Chain 2, Iteration:    1 / 2500 [  0%]  (Warmup)
## Chain 2, Iteration:  250 / 2500 [ 10%]  (Warmup)
## Chain 2, Iteration:  500 / 2500 [ 20%]  (Warmup)
## Chain 2, Iteration:  750 / 2500 [ 30%]  (Warmup)
## Chain 2, Iteration: 1000 / 2500 [ 40%]  (Warmup)
## Chain 2, Iteration: 1001 / 2500 [ 40%]  (Sampling)
## Chain 2, Iteration: 1250 / 2500 [ 50%]  (Sampling)
## Chain 2, Iteration: 1500 / 2500 [ 60%]  (Sampling)
## Chain 2, Iteration: 1750 / 2500 [ 70%]  (Sampling)
## Chain 2, Iteration: 2000 / 2500 [ 80%]  (Sampling)
## Chain 2, Iteration: 2250 / 2500 [ 90%]  (Sampling)
## Chain 2, Iteration: 2500 / 2500 [100%]  (Sampling)# 
## #  Elapsed Time: 1.091 seconds (Warm-up)
## #                0.916 seconds (Sampling)
## #                2.007 seconds (Total)
## # 
## Inference for Stan model: 23368f84e08f2ce0726de8466b4ef4d3.
## 2 chains, each with iter=2500; warmup=1000; thin=1; 
## post-warmup draws per chain=1500, total post-warmup draws=3000.
## 
##        mean se_mean   sd  2.5%   25%   50%   75% 97.5% n_eff Rhat
## mu     0.84    0.00 0.08  0.68  0.79  0.84  0.89  1.00   769    1
## kappa  3.91    0.03 0.72  2.36  3.43  3.98  4.51  4.95   500    1
## psi    0.15    0.00 0.09 -0.07  0.10  0.16  0.21  0.32   447    1
## lp__  64.57    0.05 1.37 61.20 63.96 64.95 65.57 66.10   660    1
## 
## Samples were drawn using NUTS(diag_e) at Tue May 24 18:23:56 2016.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).

##        true  estimate
## 1 0.7853982 0.8416541
## 2 4.0000000 3.9106862
## 3 0.1250000 0.1507968
fit_and_plot(mu=-pi/2, kappa=4, psi=1/8, iter=2500)
## 
## SAMPLING FOR MODEL '23368f84e08f2ce0726de8466b4ef4d3' NOW (CHAIN 1).
## 
## Chain 1, Iteration:   1 / 1000 [  0%]  (Warmup)
## Chain 1, Iteration: 100 / 1000 [ 10%]  (Warmup)
## Chain 1, Iteration: 200 / 1000 [ 20%]  (Warmup)
## Chain 1, Iteration: 300 / 1000 [ 30%]  (Warmup)
## Chain 1, Iteration: 400 / 1000 [ 40%]  (Warmup)
## Chain 1, Iteration: 500 / 1000 [ 50%]  (Warmup)
## Chain 1, Iteration: 501 / 1000 [ 50%]  (Sampling)
## Chain 1, Iteration: 600 / 1000 [ 60%]  (Sampling)
## Chain 1, Iteration: 700 / 1000 [ 70%]  (Sampling)
## Chain 1, Iteration: 800 / 1000 [ 80%]  (Sampling)
## Chain 1, Iteration: 900 / 1000 [ 90%]  (Sampling)
## Chain 1, Iteration: 1000 / 1000 [100%]  (Sampling)# 
## #  Elapsed Time: 1.069 seconds (Warm-up)
## #                0.406 seconds (Sampling)
## #                1.475 seconds (Total)
## # 
## 
## SAMPLING FOR MODEL '23368f84e08f2ce0726de8466b4ef4d3' NOW (CHAIN 1).
## 
## Chain 1, Iteration:    1 / 2500 [  0%]  (Warmup)
## Chain 1, Iteration:  250 / 2500 [ 10%]  (Warmup)
## Chain 1, Iteration:  500 / 2500 [ 20%]  (Warmup)
## Chain 1, Iteration:  750 / 2500 [ 30%]  (Warmup)
## Chain 1, Iteration: 1000 / 2500 [ 40%]  (Warmup)
## Chain 1, Iteration: 1001 / 2500 [ 40%]  (Sampling)
## Chain 1, Iteration: 1250 / 2500 [ 50%]  (Sampling)
## Chain 1, Iteration: 1500 / 2500 [ 60%]  (Sampling)
## Chain 1, Iteration: 1750 / 2500 [ 70%]  (Sampling)
## Chain 1, Iteration: 2000 / 2500 [ 80%]  (Sampling)
## Chain 1, Iteration: 2250 / 2500 [ 90%]  (Sampling)
## Chain 1, Iteration: 2500 / 2500 [100%]  (Sampling)# 
## #  Elapsed Time: 1.499 seconds (Warm-up)
## #                1.169 seconds (Sampling)
## #                2.668 seconds (Total)
## # 
## 
## SAMPLING FOR MODEL '23368f84e08f2ce0726de8466b4ef4d3' NOW (CHAIN 2).
## 
## Chain 2, Iteration:    1 / 2500 [  0%]  (Warmup)
## Chain 2, Iteration:  250 / 2500 [ 10%]  (Warmup)
## Chain 2, Iteration:  500 / 2500 [ 20%]  (Warmup)
## Chain 2, Iteration:  750 / 2500 [ 30%]  (Warmup)
## Chain 2, Iteration: 1000 / 2500 [ 40%]  (Warmup)
## Chain 2, Iteration: 1001 / 2500 [ 40%]  (Sampling)
## Chain 2, Iteration: 1250 / 2500 [ 50%]  (Sampling)
## Chain 2, Iteration: 1500 / 2500 [ 60%]  (Sampling)
## Chain 2, Iteration: 1750 / 2500 [ 70%]  (Sampling)
## Chain 2, Iteration: 2000 / 2500 [ 80%]  (Sampling)
## Chain 2, Iteration: 2250 / 2500 [ 90%]  (Sampling)
## Chain 2, Iteration: 2500 / 2500 [100%]  (Sampling)# 
## #  Elapsed Time: 9.933 seconds (Warm-up)
## #                1.455 seconds (Sampling)
## #                11.388 seconds (Total)
## # 
## Inference for Stan model: 23368f84e08f2ce0726de8466b4ef4d3.
## 2 chains, each with iter=2500; warmup=1000; thin=1; 
## post-warmup draws per chain=1500, total post-warmup draws=3000.
## 
##        mean se_mean   sd  2.5%   25%   50%   75% 97.5% n_eff Rhat
## mu    -1.64    0.00 0.06 -1.76 -1.68 -1.64 -1.60 -1.52  1279    1
## kappa  3.96    0.03 0.67  2.53  3.47  4.05  4.51  4.95   436    1
## psi    0.00    0.01 0.10 -0.25 -0.05  0.02  0.07  0.16   310    1
## lp__  78.36    0.05 1.23 75.14 77.77 78.69 79.27 79.79   739    1
## 
## Samples were drawn using NUTS(diag_e) at Tue May 24 18:25:01 2016.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).

##        true     estimate
## 1 -1.570796 -1.637131794
## 2  4.000000  3.956281705
## 3  0.125000  0.001154499

Fitting a wrapped Cauchy:

fit_and_plot(mu=pi/2, kappa=3, psi=-1, iter=2500)
## 
## SAMPLING FOR MODEL '23368f84e08f2ce0726de8466b4ef4d3' NOW (CHAIN 1).
## 
## Chain 1, Iteration:   1 / 1000 [  0%]  (Warmup)
## Chain 1, Iteration: 100 / 1000 [ 10%]  (Warmup)
## Chain 1, Iteration: 200 / 1000 [ 20%]  (Warmup)
## Chain 1, Iteration: 300 / 1000 [ 30%]  (Warmup)
## Chain 1, Iteration: 400 / 1000 [ 40%]  (Warmup)
## Chain 1, Iteration: 500 / 1000 [ 50%]  (Warmup)
## Chain 1, Iteration: 501 / 1000 [ 50%]  (Sampling)
## Chain 1, Iteration: 600 / 1000 [ 60%]  (Sampling)
## Chain 1, Iteration: 700 / 1000 [ 70%]  (Sampling)
## Chain 1, Iteration: 800 / 1000 [ 80%]  (Sampling)
## Chain 1, Iteration: 900 / 1000 [ 90%]  (Sampling)
## Chain 1, Iteration: 1000 / 1000 [100%]  (Sampling)# 
## #  Elapsed Time: 0.507 seconds (Warm-up)
## #                0.343 seconds (Sampling)
## #                0.85 seconds (Total)
## # 
## 
## SAMPLING FOR MODEL '23368f84e08f2ce0726de8466b4ef4d3' NOW (CHAIN 1).
## 
## Chain 1, Iteration:    1 / 2500 [  0%]  (Warmup)
## Chain 1, Iteration:  250 / 2500 [ 10%]  (Warmup)
## Chain 1, Iteration:  500 / 2500 [ 20%]  (Warmup)
## Chain 1, Iteration:  750 / 2500 [ 30%]  (Warmup)
## Chain 1, Iteration: 1000 / 2500 [ 40%]  (Warmup)
## Chain 1, Iteration: 1001 / 2500 [ 40%]  (Sampling)
## Chain 1, Iteration: 1250 / 2500 [ 50%]  (Sampling)
## Chain 1, Iteration: 1500 / 2500 [ 60%]  (Sampling)
## Chain 1, Iteration: 1750 / 2500 [ 70%]  (Sampling)
## Chain 1, Iteration: 2000 / 2500 [ 80%]  (Sampling)
## Chain 1, Iteration: 2250 / 2500 [ 90%]  (Sampling)
## Chain 1, Iteration: 2500 / 2500 [100%]  (Sampling)# 
## #  Elapsed Time: 0.917 seconds (Warm-up)
## #                1.547 seconds (Sampling)
## #                2.464 seconds (Total)
## # 
## 
## SAMPLING FOR MODEL '23368f84e08f2ce0726de8466b4ef4d3' NOW (CHAIN 2).
## 
## Chain 2, Iteration:    1 / 2500 [  0%]  (Warmup)
## Chain 2, Iteration:  250 / 2500 [ 10%]  (Warmup)
## Chain 2, Iteration:  500 / 2500 [ 20%]  (Warmup)
## Chain 2, Iteration:  750 / 2500 [ 30%]  (Warmup)
## Chain 2, Iteration: 1000 / 2500 [ 40%]  (Warmup)
## Chain 2, Iteration: 1001 / 2500 [ 40%]  (Sampling)
## Chain 2, Iteration: 1250 / 2500 [ 50%]  (Sampling)
## Chain 2, Iteration: 1500 / 2500 [ 60%]  (Sampling)
## Chain 2, Iteration: 1750 / 2500 [ 70%]  (Sampling)
## Chain 2, Iteration: 2000 / 2500 [ 80%]  (Sampling)
## Chain 2, Iteration: 2250 / 2500 [ 90%]  (Sampling)
## Chain 2, Iteration: 2500 / 2500 [100%]  (Sampling)# 
## #  Elapsed Time: 0.935 seconds (Warm-up)
## #                1.34 seconds (Sampling)
## #                2.275 seconds (Total)
## # 
## Inference for Stan model: 23368f84e08f2ce0726de8466b4ef4d3.
## 2 chains, each with iter=2500; warmup=1000; thin=1; 
## post-warmup draws per chain=1500, total post-warmup draws=3000.
## 
##         mean se_mean   sd   2.5%    25%    50%    75%  97.5% n_eff Rhat
## mu      1.58    0.00 0.02   1.54   1.57   1.58   1.59   1.62  1033    1
## kappa   3.40    0.01 0.31   2.85   3.19   3.39   3.58   4.07  1149    1
## psi    -0.79    0.00 0.12  -1.05  -0.87  -0.78  -0.70  -0.56  1089    1
## lp__  141.49    0.04 1.24 138.26 140.94 141.85 142.39 142.87   832    1
## 
## Samples were drawn using NUTS(diag_e) at Tue May 24 18:25:57 2016.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).

##        true   estimate
## 1  1.570796  1.5793295
## 2  3.000000  3.3989945
## 3 -1.000000 -0.7869314

TODO: the model has trouble fitting for some values of \(\mu\). For example, fitting a Cartwright's power-of-cosine distribution:

fit_and_plot(mu=-pi/2  , kappa=5, psi=2)
## 
## SAMPLING FOR MODEL '23368f84e08f2ce0726de8466b4ef4d3' NOW (CHAIN 1).
## [1000 iterations: warmup/sampling progress output omitted]
## #  Elapsed Time: 8.536 seconds (Warm-up), 0.436 seconds (Sampling), 8.972 seconds (Total)
## 
## SAMPLING FOR MODEL '23368f84e08f2ce0726de8466b4ef4d3' NOW (CHAIN 1).
## [5000 iterations: warmup/sampling progress output omitted]
## #  Elapsed Time: 2.344 seconds (Warm-up), 4.976 seconds (Sampling), 7.32 seconds (Total)
## 
## SAMPLING FOR MODEL '23368f84e08f2ce0726de8466b4ef4d3' NOW (CHAIN 2).
## [5000 iterations: warmup/sampling progress output omitted]
## #  Elapsed Time: 44.556 seconds (Warm-up), 559.987 seconds (Sampling), 604.543 seconds (Total)
## Inference for Stan model: 23368f84e08f2ce0726de8466b4ef4d3.
## 2 chains, each with iter=5000; warmup=1000; thin=1; 
## post-warmup draws per chain=4000, total post-warmup draws=8000.
## 
##        mean se_mean   sd  2.5%   25%   50%   75% 97.5% n_eff Rhat
## mu    -1.64    0.01 0.15 -1.96 -1.74 -1.62 -1.58 -1.34   120 1.03
## kappa  1.78    0.58 1.17  0.71  0.99  1.20  2.29  4.65     4 1.11
## psi   -1.63    0.37 1.96 -4.83 -3.32 -1.57 -0.38  2.10    28 1.04
## lp__  19.17    0.13 1.94 14.79 18.19 19.56 20.39 22.26   210 1.01
## 
## Samples were drawn using NUTS(diag_e) at Tue May 24 18:37:05 2016.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).

##        true  estimate
## 1 -1.570796 -1.644850
## 2  5.000000  1.778364
## 3  2.000000 -1.631392
fit_and_plot(mu=0      , kappa=5, psi=2)
## 
## SAMPLING FOR MODEL '23368f84e08f2ce0726de8466b4ef4d3' NOW (CHAIN 1).
## [1000 iterations: warmup/sampling progress output omitted]
## #  Elapsed Time: 0.53 seconds (Warm-up), 0.408 seconds (Sampling), 0.938 seconds (Total)
## 
## SAMPLING FOR MODEL '23368f84e08f2ce0726de8466b4ef4d3' NOW (CHAIN 1).
## [5000 iterations: warmup/sampling progress output omitted]
## #  Elapsed Time: 1.154 seconds (Warm-up), 2.558 seconds (Sampling), 3.712 seconds (Total)
## 
## SAMPLING FOR MODEL '23368f84e08f2ce0726de8466b4ef4d3' NOW (CHAIN 2).
## [5000 iterations: warmup/sampling progress output omitted]
## #  Elapsed Time: 7.324 seconds (Warm-up), 8.879 seconds (Sampling), 16.203 seconds (Total)
## Inference for Stan model: 23368f84e08f2ce0726de8466b4ef4d3.
## 2 chains, each with iter=5000; warmup=1000; thin=1; 
## post-warmup draws per chain=4000, total post-warmup draws=8000.
## 
##        mean se_mean   sd  2.5%   25%   50%   75% 97.5% n_eff Rhat
## mu    -0.07    0.00 0.17 -0.42 -0.18 -0.07  0.05  0.27  1979    1
## kappa  2.98    0.03 1.20  0.82  1.99  3.00  4.00  4.92  1574    1
## psi    2.79    0.04 1.20  1.40  1.98  2.48  3.26  6.03   787    1
## lp__  11.18    0.05 1.63  6.93 10.40 11.61 12.39 13.02   893    1
## 
## Samples were drawn using NUTS(diag_e) at Tue May 24 18:38:14 2016.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).

##   true    estimate
## 1    0 -0.06728715
## 2    5  2.97930447
## 3    2  2.79361632
fit_and_plot(mu=-3*pi/4, kappa=5, psi=2)
## 
## SAMPLING FOR MODEL '23368f84e08f2ce0726de8466b4ef4d3' NOW (CHAIN 1).
## [1000 iterations: warmup/sampling progress output omitted]
## #  Elapsed Time: 0.855 seconds (Warm-up), 0.523 seconds (Sampling), 1.378 seconds (Total)
## 
## SAMPLING FOR MODEL '23368f84e08f2ce0726de8466b4ef4d3' NOW (CHAIN 1).
## [5000 iterations: warmup/sampling progress output omitted]
## #  Elapsed Time: 1.276 seconds (Warm-up), 2.426 seconds (Sampling), 3.702 seconds (Total)
## 
## SAMPLING FOR MODEL '23368f84e08f2ce0726de8466b4ef4d3' NOW (CHAIN 2).
## [5000 iterations: warmup/sampling progress output omitted]
## #  Elapsed Time: 0.933 seconds (Warm-up), 2.945 seconds (Sampling), 3.878 seconds (Total)
## Inference for Stan model: 23368f84e08f2ce0726de8466b4ef4d3.
## 2 chains, each with iter=5000; warmup=1000; thin=1; 
## post-warmup draws per chain=4000, total post-warmup draws=8000.
## 
##        mean se_mean   sd  2.5%   25%   50%   75% 97.5% n_eff Rhat
## mu    -1.66    0.97 2.06 -2.84 -2.64 -2.53 -2.37  3.12     5 1.66
## kappa  2.97    0.08 1.19  0.87  2.01  3.00  3.96  4.94   210 1.01
## psi    1.94    0.34 1.17  0.82  1.32  1.65  2.16  5.24    12 1.15
## lp__  14.28    1.97 4.34  3.86 14.02 16.08 17.12 17.85     5 1.50
## 
## Samples were drawn using NUTS(diag_e) at Tue May 24 18:39:09 2016.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).

##        true  estimate
## 1 -2.356194 -1.662193
## 2  5.000000  2.972718
## 3  2.000000  1.940459

We notice that even when the true distribution is the same (only an angular shift was applied) the fitting performance is quite different: for \(\mu=-3\pi/4\) the low n_eff values and the Rhat values far from 1 show that the chains did not converge.
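One possible workaround, sketched here under assumptions (this is not part of the original analysis): rotate the sample so that its circular mean sits at zero, where the fits above behaved well, run the same model on the rotated angles, and rotate the estimated \(\mu\) back afterwards. Here my_data is assumed to be a numeric vector of angles in radians, as in the earlier examples:

library(circular)

center  <- as.numeric(mean(circular(my_data)))                   # circular mean of the sample
shifted <- ((as.numeric(my_data) - center + pi) %% (2*pi)) - pi  # rotate so the mean sits at 0
# ... fit the same Stan model on `shifted` instead of the raw angles ...
# mu_hat <- ((mu_hat_shifted + center + pi) %% (2*pi)) - pi      # undo the rotation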

Nonparametric Methods

This section is entirely based on NPCirc: An R Package for Nonparametric Circular Methods, Journal of Statistical Software, vol. 61, issue 9.

Nonparametric methods are useful for fitting complex circular models (e.g., mixtures with many components) where classical parametric models are not flexible enough. Package NPCirc provides functions to deal with this type of problem.

library(NPCirc)

The package allows for sampling from, and evaluating the density of, complex mixtures. Among the available component distributions are the von Mises ("vm") and the wrapped Cauchy ("wc") used below.

The next example shows a mixture of 70% of a von Mises(\(\pi/2\), 4) and 30% of a wrapped Cauchy(\(\pi\), 0.75):

t <- circular(seq(0,2*pi,len=500))

# sampling n points
n <- 250
params  <- list(p=c(.7,.3), mu=c(pi/2,pi), con=c(4,.75)) # mixture parameters
my_data <- rcircmix(n, dist=c("vm", "wc"), param=params)

# finding the density
dens <- dcircmix(x=t, dist=c("vm", "wc"), param=params)

# plotting them
plot(my_data, shrink=1.5)
lines(t, dens, lwd=2, col="blue")
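As a sanity check (not in the original text), the mixture density can be assembled by hand from the component densities of package circular; this assumes the con entry of a wrapped Cauchy component is its concentration \(\rho\):

dens_by_hand <- 0.7 * dvonmises(t, mu=circular(pi/2), kappa=4) +
                0.3 * dwrappedcauchy(t, mu=circular(pi), rho=0.75)
max(abs(dens_by_hand - dens))  # should be numerically zero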

Function kern.den.circ computes a circular kernel density estimate, using the von Mises density as kernel. The estimator needs a smoothing parameter, which can be selected by several different methods:

# Select a smoothing parameter for the nonparametric estimation;
# several selectors are available
bw1 <- bw.boot(my_data) # bootstrap
bw2 <- bw.CV(my_data)   # cross-validation
bw3 <- bw.rt(my_data)   # rule of thumb
bw4 <- bw.pi(my_data)   # plug-in rule

# select one, say bw4, and make the density estimation
est <- kern.den.circ(x=my_data, t=t, bw=bw4)

# plot it
par(mfrow=c(1,2))
plot(my_data, shrink=1.5)
lines(t, dens, lwd=2, col="blue")
lines(est, lwd=2, col="red")

xs <- seq(0,2*pi,len=500)
plot.default(my_data, rep(-.02,n), ylim=c(-.02,.6), col="grey40", ylab="")
lines(xs, dens, lwd=2, col="blue")
lines(xs, est$y, lwd=2, col="red")
abline(h=0, lty=4)
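For intuition, here is a minimal by-hand sketch of the estimator, assuming (as described in the NPCirc paper) that the smoothing parameter acts as the concentration of a von Mises kernel centred at each observation:

kde_circ <- function(theta, data, bw) {
  # average of von Mises kernels, one centred at each data point
  sapply(theta, function(t0)
    mean(dvonmises(circular(data), mu=circular(t0), kappa=bw)))
}

# should closely match est$y computed above:
# head(kde_circ(as.numeric(t), as.numeric(my_data), bw=bw4))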

There are also methods for regression: kern.reg.circ.lin for a circular covariate with a linear response, kern.reg.circ.circ for a circular covariate with a circular response, and kern.reg.lin.circ for a linear covariate with a circular response.

An example with a circular covariate and a linear response:

data("speed.wind2", package = "NPCirc") # speed.wind2 is a dataset available in the package
nas <- which(is.na(speed.wind2$Speed)) # find NA's for removal
dir <- circular(speed.wind2$Direction[-nas], units = "degrees") 
vel <- speed.wind2$Speed[-nas]

estLL <- kern.reg.circ.lin(dir, vel, method = "LL") # there's also method "NW"
par(mfrow=c(1,2))
plot(estLL, plot.type = "circle", points.plot = TRUE,
     labels = c("N", "NE", "E", "SE", "S", "SW", "W", "NW"),
     label.pos = seq(0, 7 * pi/4, by = pi/4),
     zero = pi/2, clockwise = TRUE, lwd = 2, line.col = "red", main = "")

plot(estLL, plot.type = "line", points.plot = TRUE, lwd = 2, line.col = "red",
     xlab = "direction", ylab = "speed (m/s)", main="")
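The "NW" option mentioned above is the classical Nadaraya-Watson estimator with a von Mises kernel; here is a minimal sketch of it, assuming the bandwidth acts as the kernel concentration (angles in radians):

nw_circ_lin <- function(theta0, theta, y, bw) {
  w <- as.numeric(dvonmises(circular(theta), mu=circular(theta0), kappa=bw))  # kernel weights
  sum(w * y) / sum(w)                                                         # weighted mean of the responses
}

# hypothetical usage on the wind data (dir is in degrees, so convert first):
# nw_circ_lin(pi/4, as.numeric(dir)*pi/180, vel, bw=10)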

An example with a circular covariate and a circular response:

data("wind", package = "NPCirc")
wind6  <- circular(wind$wind.dir[seq( 7, 1752, by = 24)]) # wind direction at 6am
wind12 <- circular(wind$wind.dir[seq(13, 1752, by = 24)]) # wind direction at noon

estLL <- kern.reg.circ.circ(wind6, wind12, bw = 2.25, method = "LL")

plot(estLL, plot.type = "line", points.plot = TRUE, line.col = 2, lwd = 2, xlab = "Wind direction at 6 a.m.", ylab = "Wind direction at noon", main="")
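With a circular response, the local estimate becomes a weighted circular mean instead of a weighted average; an NW-type sketch under the same assumptions (the LL fit above is the local linear refinement of this idea):

nw_circ_circ <- function(theta0, theta, y, bw) {
  w <- as.numeric(dvonmises(circular(theta), mu=circular(theta0), kappa=bw))  # kernel weights
  atan2(sum(w * sin(y)), sum(w * cos(y)))                                     # weighted circular mean
}

# hypothetical usage, assuming the directions are in radians:
# nw_circ_circ(pi, as.numeric(wind6), as.numeric(wind12), bw=2.25)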

And an interactive plot:

plot(estLL, plot.type = "circle", points.plot = TRUE, line.col = 2, lwd = 2, points.col = 1, units = "degrees")


Finally, an example with a linear covariate and a circular response:

data("periwinkles", package = "NPCirc")
dist <- periwinkles$distance
dir <- circular(periwinkles$direction, units = "degrees")

estLL <- kern.reg.lin.circ(dist, dir, t = NULL, bw = 200, method = "LL")
plot(estLL, plot.type = "circle", points.plot = TRUE, line.col = 2, lwd = 2, points.col = 2)
