Comparing integration methods

Objective

For testing and diagnosis purposes, we want to compute the probability of a tie for first between two candidates in a three-candidate plurality contest.

For the integrand function, we assume that results are distributed according to \(\text{Dirichlet}(10, 8, 6)\).

alpha = c(10, 8, 6)

Direct Monte Carlo

We draw a large number of simulated elections and see how often the first two candidates nearly tie:

set.seed(12345)
N <- 10000000
sims <- gtools::rdirichlet(N, alpha)
tol <- .01 # window defining a tie
dmc_result <- mean(sims[,1] > sims[,3] & #1  beats 3
                     sims[,2] > sims[,3] & # 2 beats 3 
                     abs(sims[,1] - sims[,2]) < tol/2 # 1 and 2 are within tol
                   )/tol # divide by tol to approximate the integral we want 
dmc_result

## [1] 1.58735

Numerical integration via regular grid

We integrate the integrand function at grid points along the line between \((1/3, 1/3, 0)\) and \((1/2, 1/2, 0)\):

n_midpoints <- 1000
breakpoints <- seq(from = 1/3, to = 1/2, length = n_midpoints + 1)
segment_length <- breakpoints[2] - breakpoints[1]
midpoints <- breakpoints[-length(breakpoints)] + segment_length/2
grid <- cbind(midpoints, midpoints, 1 - 2*midpoints)
fn_values <- gtools::ddirichlet(grid, alpha = alpha)
nigv_result <- sum(fn_values)*segment_length
nigv_result

## [1] 1.593612

The results agree pretty closely.

Numerical integration via `SimplicialCubature`

The function we want to integrate is this:

dir_fn <- function(x, alpha){
  gtools::ddirichlet(x = as.vector(x), alpha = alpha)
}

(See below why I write as.vector(x), which suggests a possible fix for John’s code.)

The S matrix in this case has two columns identifying the points on the unit simplex between which we want to integrate:

S <- cbind(c(1/2, 1/2, 0), c(1/3, 1/3, 1/3))
S

##      [,1]      [,2]
## [1,]  0.5 0.3333333
## [2,]  0.5 0.3333333
## [3,]  0.0 0.3333333

If we pass it to our Dirichlet density function, we get:

sc_result <- SimplicialCubature::adaptIntegrateSimplex(dir_fn, S = S, alpha = alpha)
sc_result$integral

## [1] 3.903537

For a reminder, our earlier answers were 1.58735 (from direct Monte Carlo) and 1.5936125 (from a manual grid approach).

Diagnosis

What explains the different results? Let’s see how the differences vary across the three possible ties we could have in the same election.

Here is the probability of each tie according to direct Monte Carlo:

ab_tie_for_first <- function(sims, tol = .01){
  row_max <- apply(sims, 1, max)
  mean((sims[,1] == row_max | sims[,2] == row_max) & abs(sims[,1] - sims[,2]) < tol/2)/tol  
}

dmc_3 <- c(ab_tie_for_first(sims), ab_tie_for_first(sims[,c(1,3,2)]), ab_tie_for_first(sims[,c(2,3,1)]))
dmc_3

## [1] 1.59715 0.65581 0.37128

And here are the three SimplicialCubature-based results:

sc_result_ac <- SimplicialCubature::adaptIntegrateSimplex(dir_fn, S = cbind(c(.5, 0, .5), c(1/3, 1/3, 1/3)), alpha = alpha)
sc_result_bc <- SimplicialCubature::adaptIntegrateSimplex(dir_fn, S = cbind(c(.0, .5, .5), c(1/3, 1/3, 1/3)), alpha = alpha)

sc_3 <- c(sc_result$integral, sc_result_ac$integral, sc_result_bc$integral)
sc_3

## [1] 3.9035369 1.5831646 0.8886139

sc_3/dmc_3

## [1] 2.444064 2.414060 2.393379

Same comparison but for different parameters on the integrand function:

alpha2 <- c(18, 15, 8)
sc_result_ab <- SimplicialCubature::adaptIntegrateSimplex(dir_fn, S = cbind(c(.5, .5, 0), c(1/3, 1/3, 1/3)), alpha = alpha2)
sc_result_ac <- SimplicialCubature::adaptIntegrateSimplex(dir_fn, S = cbind(c(.5, 0, .5), c(1/3, 1/3, 1/3)), alpha = alpha2)
sc_result_bc <- SimplicialCubature::adaptIntegrateSimplex(dir_fn, S = cbind(c(.0, .5, .5), c(1/3, 1/3, 1/3)), alpha = alpha2)

sc_3a <- c(sc_result_ab$integral, sc_result_ac$integral, sc_result_bc$integral)
sc_3a

## [1] 5.8889768 0.3037420 0.1781902

sims2 <- gtools::rdirichlet(N/10, alpha2)
dmc_3a <- c(ab_tie_for_first(sims2), ab_tie_for_first(sims2[,c(1,3,2)]), ab_tie_for_first(sims2[,c(2,3,1)]))

dmc_3a

## [1] 2.3888 0.1260 0.0729

sc_3a/dmc_3a

## [1] 2.465245 2.410651 2.444311

So the SimplicialCubature results are consistently around 2.4 times bigger. What explains this? Is there a Jacobian issue somewhere?

A different problem

Here is a different problem that involves integrating an area rather than integrating along a line: what is the probability that candidate 1 wins more than 50% of the vote?

The simulation answer:

sim_50 <- mean(sims[,1] > .5)
sim_50

## [1] 0.2024958

The SimplicialCubature answer:

S_50 <- cbind(c(.5, .5, 0), c(.5, 0, .5), c(1,0,0))
sc_50 <- SimplicialCubature::adaptIntegrateSimplex(dir_fn, S = S_50, alpha = alpha)
sc_50$integral

## [1] 0.3506305

The answers are again different, but this time the ratio is \(\sqrt{3}\):

sc_50$integral/sim_50

## [1] 1.731544

As is the ratio for the same quantity for each of the other candidates:

b_50 <- SimplicialCubature::adaptIntegrateSimplex(dir_fn, S = cbind(c(.5, .5, 0), c(0, .5, .5), c(0,1,0)), alpha = alpha)
c_50 <- SimplicialCubature::adaptIntegrateSimplex(dir_fn, S = cbind(c(0, .5, .5), c(.5, 0, .5), c(0,0,1)), alpha = alpha)

sc_50s <- c(sc_50$integral, b_50$integral, c_50$integral)
dmc_50s <- c(mean(sims[,1] > .5), mean(sims[,2] > .5), mean(sims[,3] > .5))

sc_50s/dmc_50s

## [1] 1.731544 1.731836 1.729125

We can actually match the simulation-based result with SimplicialCubature if we drop a dimension:

S_50_2 <- cbind(c(.5, .5), c(.5, 0), c(1,0))
dir_fn2 <- function(x, alpha){gtools::ddirichlet(c(x, 1 - sum(x)), alpha = alpha)}
sc_50_2 <- SimplicialCubature::adaptIntegrateSimplex(dir_fn2, S = S_50_2, alpha = alpha)
sc_50_2$integral

## [1] 0.2024365

But we can’t match the tie probabilities this way:

sc_ab <- SimplicialCubature::adaptIntegrateSimplex(dir_fn2, S = cbind(c(.5, .5), c(1/3, 1/3)), alpha = alpha)
sc_ac <- SimplicialCubature::adaptIntegrateSimplex(dir_fn2, S = cbind(c(.5, 0), c(1/3, 1/3)), alpha = alpha)
sc_bc <- SimplicialCubature::adaptIntegrateSimplex(dir_fn2, S = cbind(c(.0, .5), c(1/3, 1/3)), alpha = alpha)

sc_2d <- c(sc_ab$integral, sc_ac$integral, sc_bc$integral)
sc_2d

## [1] 2.2537081 1.4452249 0.8111898

The relationship between these tie probabilities and the simulation-based probabilities is not linear:

sc_2d/dmc_3

## [1] 1.411081 2.203725 2.184846

I have assumed that the simulation-based results are the target, so it seems like my use of SimplicialCubature must be incorrect.

Suggestion for `SimplicialCubature`

I encountered an issue trying to use gtools::ddirichlet as my integrand function, and I wondered if John might want to fix something in the SimplicialCubature code to address it.

To highlight the issue, let’s integrate a function on the unit simplex. This works fine:

S <- cbind(c(1,0,0), c(0,1,0), c(0,0,1)) # same as UnitSimplexV(3)
sum_fn <- function(x){sum(x)}
SimplicialCubature::adaptIntegrateSimplex(sum_fn, S)

## $integral
## [1] 0.8660254
## 
## $estAbsError
## [1] 8.660254e-13
## 
## $functionEvaluations
## [1] 32
## 
## $returnCode
## [1] 0
## 
## $message
## [1] "OK"

But we encounter an error when we replace the integrand function with a Dirichlet density:

dir_fn_naive <- function(x, alpha){
  gtools::ddirichlet(x = x, alpha = alpha)
}
SimplicialCubature::adaptIntegrateSimplex(dir_fn_naive, S, alpha = c(1,1,1))

## Error in gtools::ddirichlet(x = x, alpha = alpha): Mismatch between dimensions of 'x' and 'alpha'.

gtools::ddirichlet() seems to think it’s being given an x argument that is not of the same length as alpha (3).

The problem seems to be that x is a matrix, and gtools::ddirichlet() is expecting a vector. If we change the format explicitly, no error:

dir_fn_fixed <- function(x, alpha){
  gtools::ddirichlet(x = as.vector(x), alpha = alpha)
}
SimplicialCubature::adaptIntegrateSimplex(dir_fn_fixed, S, alpha = c(1,1,1))

## $integral
## [1] 1.732051
## 
## $estAbsError
## [1] 1.732051e-12
## 
## $functionEvaluations
## [1] 32
## 
## $returnCode
## [1] 0
## 
## $message
## [1] "OK"

We could consider altering John alter the code (presumably adsimp() and integrate.vector.fn()) to work with integrand functions that expect x to be a vector?

Generating the `S` matrix: John’s code and my code (my notes, really)

John kindly sent me some code for generating the S matrix given some vertices. Here I load his code and run an example he sent along:

source('R/TesselationsOfPointCloud.R')
library(mvmesh)

## Loading required package: rcdd

## If you want correct answers, use rational arithmetic.
## See the Warnings sections added to help pages for
##     functions that do computational geometry.

## Loading required package: geometry

## Loading required package: abind

## Loading required package: SimplicialCubature

V <- cbind( c(1,0,1e-20), c(0,1,0), c(0,0,1), c(.25,.5,.25), c(.1,.4,.5) )
V

##       [,1] [,2] [,3] [,4] [,5]
## [1,] 1e+00    0    0 0.25  0.1
## [2,] 0e+00    1    0 0.50  0.4
## [3,] 1e-20    0    1 0.25  0.5

S <- simplicesFromVertices( V )
S

## , , 1
## 
##      [,1]  [,2] [,3]
## [1,] 0.25 1e+00    0
## [2,] 0.50 0e+00    1
## [3,] 0.25 1e-20    0
## 
## , , 2
## 
##      [,1] [,2]  [,3]
## [1,] 0.25    0 1e+00
## [2,] 0.50    0 0e+00
## [3,] 0.25    1 1e-20
## 
## , , 3
## 
##      [,1] [,2] [,3]
## [1,]  0.1    0    0
## [2,]  0.4    0    1
## [3,]  0.5    1    0
## 
## , , 4
## 
##      [,1] [,2] [,3]
## [1,]  0.1 0.25    0
## [2,]  0.4 0.50    1
## [3,]  0.5 0.25    0
## 
## , , 5
## 
##      [,1] [,2] [,3]
## [1,]  0.1 0.25    0
## [2,]  0.4 0.50    0
## [3,]  0.5 0.25    1

open3d()

## null 
##    1

  points3d( t(V), size=10,col="red" )
  for (i in 1:dim(S)[3]) {
    wrap <- cbind( S[,,i], S[,1,i] )
    lines3d(t(wrap), width=3, col='blue')
  }

You must enable Javascript to view this page properly.

This code is useful for taking a set of vertices and tesselating it, i.e. rendering a bunch of simplices in the form of an S matrix that we can then pass to the adaptIntegrateSimplex function.

The approach I wrote up is based on determining the convex hull of a set of vertices and then extracting the facets relevant to my problem. The geometry::convhulln() function represents the convex hull as triangulated/tesselated combinations of vertices, so my function takes the whole set of vertices for a candidate’s “win region”, get the triangulated/tesselated convex hull, and identifies those facets where a certain condition binds (e.g. \(a\) tying \(b\) in a particular way) at all vertices; these are the facets where I need to integrate my function.

In short, my code is more specific to my problem, but like John’s code it involves dropping a dimension (projecting down to \(\mathbb{R}^{k-1}\)) and tesselating.

Comparing integration methods

Andy Eggers

8/18/2020

Objective

Direct Monte Carlo

Numerical integration via regular grid

Numerical integration via `SimplicialCubature`

Diagnosis

A different problem

Suggestion for `SimplicialCubature`

Generating the `S` matrix: John’s code and my code (my notes, really)

Comparing integration methods

Andy Eggers

8/18/2020

Objective

Direct Monte Carlo

Numerical integration via regular grid

Numerical integration via SimplicialCubature

Diagnosis

A different problem

Suggestion for SimplicialCubature

Generating the S matrix: John’s code and my code (my notes, really)

Numerical integration via `SimplicialCubature`

Suggestion for `SimplicialCubature`

Generating the `S` matrix: John’s code and my code (my notes, really)