Differentiability and approximation of probability functions under Gaussian mixture Models: A Bayesian approach

#this chunk is to install and require the knitr package, that will produce an html or pdf output report 
if (!require("knitr")) {
  install.packages("knitr") 
  require("knitr") 
}

## Loading required package: knitr

The setup of this example is that described in section 5 of the article “Differentiability and approximation of probability functions under Gaussian mixture Models: A Bayesian approach” using parameters, sample sizes and other numerical specifications chosen arbitrarily. You may download the file GMMpf.Rmd and compile it using R to change these parameters to your taste.

The example

Suppose \[\xi \sim \mathcal{N}\left(\begin{pmatrix} 0 \\ C \end{pmatrix},\begin{bmatrix} 1 & 0 \\ 0& 1 \end{bmatrix}\right), \] where the y coordinate \(C \in [-1,1]\) is such that \(\frac{C+1}{2}\) has, for some \(\delta>0\), the Beta(\(\delta,\delta\)) distribution, i.e. the density of random variable \(C\) is \(\frac{\Gamma(2\delta)}{2^{2\delta-1}\Gamma(\delta)^2}(1-c^2)^{\delta-1}\) for \(c\in [-1,1]\). The chance constraint \(g:\mathbb{R} \times \mathbb{R}^2\) is given by \[g(x,\xi)=||\xi||^2-x^2-2.\] It is clear that \(g(x,(0,c)^\top) \leq 0 \iff r\leq \rho_c(x,v)=-cv_2+\sqrt{x^2+2-c^2v_1^2}\). The radial probability function and its gradient are given by \[e_c(x,v)= \int_0^{\rho_c(x,v)}re^{-r^2/2}dt=1-e^{-\rho_c(x,v)^2/2},\] and \[\nabla_x e_c(x,v)= \rho_c(x,v)e^{-\rho_c(x,v)^2/2}\frac{x}{\sqrt{x^2+2-c^2v_1^2}}.\]

The probability function and its gradient are given by

\[\Phi(x) =\frac{\Gamma(2\delta)}{2^{2\delta-1}\Gamma(\delta)^2}\int_{-1}^1(1-c^2)^{\delta-1} F_\chi^{2,c^2}(x^2+2)dc,\] and \[\nabla \Phi (x)=\frac{x\Gamma(2\delta)}{2^{2\delta-2}\Gamma(\delta)^2}\int_{-1}^1(1-c^2)^{\delta-1} f_\chi^{2,c^2}(x^2+2)dc,\]

where \(F_\chi^{k,\lambda}\) denotes the cumulative distribution function of a non-central chi-squared random variable with \(k\) degrees of freedom and noncentrality parameter \(\lambda\), and \(f_\chi^{k,\lambda}\) denoting its corresponding density. We now conduct a simulation based estimation of \(\Phi\) and its gradient vía

\[\Phi_N(x)=\frac{1}{N}\sum_{i=1}^N e_{c_i}(x,v_i) \quad \text{ and } \quad \nabla\Phi_N(x)=\frac{1}{N}\sum_{i=1}^N \nabla_x e_{c_i}(x,v_i).\] comparing \(\Phi_N\) with the estimator based on the proportion of simulations verifying the constraint

\[\tilde{\Phi}_N(x)=\frac{1}{N}\sum_{i=1}^N 1_{ \{ g(x,\xi_i)\leq 0 \}}.\]

Parameter setup

set.seed(681986) #seeded for replicability, change or remove for alternate samples
N<-100 #sample size
d<-2.5 #delta parameter, see more details at end
xlow<-0 #lower bound for range of decision variable x
xup<-4 #upper bound for range of decision variable x
xvalues<-200 #number of different values of x in range to simulate the probability function and estimators

Simulation of probability function curves and functional estimates

estradial<-rep(0,xvalues)
estgradient<-rep(0,xvalues)
estprop<-rep(0,xvalues)
trueprob<-rep(0,xvalues)
truegradient<-rep(0,xvalues)
v<-runif(N,0,2*pi)
c<-2*rbeta(N,d,d)-1
x<-seq(xlow,xup, length.out=xvalues)
rs<-sqrt(sample(rchisq(N,2)))
xisq<-rs^2 + c^2+2*rs*c*sin(v)
for(i in 1:xvalues){
  rcrit<- -c*sin(v)+sqrt(x[i]^2+2 -c^2*cos(v)^2)
  estradial[i]<-mean(1-exp(-rcrit^2/2))
  estgradient[i]<-2*x[i]*mean(rcrit*exp(-rcrit^2/2)/sqrt(x[i]+2 -c^2*cos(v)^2))
  estprop[i]<-sum(xisq < 2+x[i]^2)/N
  cdf<-function(y){(1-y^2)^(d-1)*pchisq(x[i]^2+2,2,ncp = y^2)}
  trueprob[i]<-integrate(cdf,-1,1)$value
  dens<-function(y){(1-y^2)^(d-1)*dchisq(x[i]^2+2,2,ncp = y^2)}
  truegradient[i]<-integrate(dens,-1,1)$value*2*x[i]
}

trueprob<-gamma(2*d)*trueprob/(2^(2*d-1)*gamma(d)^2)
truegradient<-gamma(2*d)*truegradient/(2^(2*d-2)*gamma(d)^2)

Plots

par(mfrow=c(1,2))
plot(x,trueprob,type="l", ylab = "Probability", main="Constraint probability")
lines(x,estprop,col="blue")
lines(x,estradial,col="red")
legend("bottomright", col = c("black", "red", "blue"), lty = 1, legend = c(expression(Phi), expression(Phi[N]), expression(tilde(Phi)[N])))
plot(x,truegradient,type="l", ylab = "Derivative", main="Gradient of probability function")
lines(x,estgradient,col="red")
legend("topright", col = c("black", "red"), lty = 1, legend = c(expression(paste(nabla,Phi)), expression(paste(nabla,Phi[N]))))

Distribution of probability function curves and gradient at arbitrary x, using the same setup as before

k<-10000 #number of repetitions
x0<-1 #value at which to generate estimates for probability function and gradient
v2<-runif(k,0,2*pi)
c2<-2*rbeta(k,d,d)-1
r2<-c2*sin(v2)+sqrt(x0^2+2 -c2^2*cos(v2)^2)
par(mfrow=c(1,2))
hist(1-exp(-r2^2/2), main= expression(paste("Distribution of ",Phi[N],"(x)")), freq = FALSE, xlab = "value of estimate")
cdf<-function(y){(1-y^2)^(d-1)*pchisq(x0^2+2,2,ncp = y^2)}
phi<-integrate(cdf,-1,1)$value
phi<-gamma(2*d)*phi/(2^(2*d-1)*gamma(d)^2)
abline(v=phi, col="red")
hist(2*x0*r2*exp(-r2^2/2)/sqrt(x0^2+2 -c2^2*cos(v2)^2), main= expression(paste("Distribution of ",nabla,Phi[N],"(x)")), freq = FALSE, xlab = "value of estimate")
dens<-function(y){(1-y^2)^(d-1)*dchisq(x0^2+2,2,ncp = y^2)}
grad<-integrate(dens,-1,1)$value*2*x0
grad<-gamma(2*d)*grad/(2^(2*d-2)*gamma(d)^2)
abline(v=grad, col="red")

Visualization of beta priors

This part contains a visualization of the density of prior parameter \(C\) for different choices of \(\delta\): for small values, the density modes at the extremes; whereas for large values it modes at \(0\) and itself resembles a (truncated) normal distribution. Values of \(\delta\) near 1 produce no significant modal points and the distribution resembles a uniform distribution.

smalld<-0.5
medd<-1.01
larged<-10
par(mfrow=c(1,3))
z<-seq(0,1,length.out=1000)
plot(2*z-1,dbeta(z,smalld,smalld)/2,type="l", ylab = "Density", main="Small parameter (0.5)", xlab = "C")
plot(2*z-1,dbeta(z,medd,medd)/2,type="l", ylab = "Density", main="Parameter close to 1 (1.01)", xlab = "C")
plot(2*z-1,dbeta(z,larged,larged)/2,type="l", ylab = "Density", main="Large parameter (10)", xlab = "C")

Differentiability and approximation of probability functions under Gaussian mixture Models: A Bayesian approach

Gonzalo Contador, Pedro Perez-Aros and Emilio Vilches