For testing and diagnosis purposes, we want to compute the probability of a tie for first between two candidates in a three-candidate plurality contest.
For the integrand function, we assume that results are distributed according to \(\text{Dirichlet}(10, 8, 6)\).
alpha = c(10, 8, 6)
We draw a large number of simulated elections and see how often the first two candidates nearly tie:
set.seed(12345)
N <- 10000000
sims <- gtools::rdirichlet(N, alpha)
tol <- .01 # window defining a tie
dmc_result <- mean(sims[,1] > sims[,3] & #1 beats 3
sims[,2] > sims[,3] & # 2 beats 3
abs(sims[,1] - sims[,2]) < tol/2 # 1 and 2 are within tol
)/tol # divide by tol to approximate the integral we want
dmc_result
## [1] 1.58735
We integrate the integrand function at grid points along the line between \((1/3, 1/3, 0)\) and \((1/2, 1/2, 0)\):
n_midpoints <- 1000
breakpoints <- seq(from = 1/3, to = 1/2, length = n_midpoints + 1)
segment_length <- breakpoints[2] - breakpoints[1]
midpoints <- breakpoints[-length(breakpoints)] + segment_length/2
grid <- cbind(midpoints, midpoints, 1 - 2*midpoints)
fn_values <- gtools::ddirichlet(grid, alpha = alpha)
nigv_result <- sum(fn_values)*segment_length
nigv_result
## [1] 1.593612
The results agree pretty closely.
SimplicialCubature
The function we want to integrate is this:
dir_fn <- function(x, alpha){
gtools::ddirichlet(x = as.vector(x), alpha = alpha)
}
(See below why I write as.vector(x)
, which suggests a possible fix for John’s code.)
The S
matrix in this case has two columns identifying the points on the unit simplex between which we want to integrate:
S <- cbind(c(1/2, 1/2, 0), c(1/3, 1/3, 1/3))
S
## [,1] [,2]
## [1,] 0.5 0.3333333
## [2,] 0.5 0.3333333
## [3,] 0.0 0.3333333
If we pass it to our Dirichlet density function, we get:
sc_result <- SimplicialCubature::adaptIntegrateSimplex(dir_fn, S = S, alpha = alpha)
sc_result$integral
## [1] 3.903537
For a reminder, our earlier answers were 1.58735 (from direct Monte Carlo) and 1.5936125 (from a manual grid approach).
What explains the different results? Let’s see how the differences vary across the three possible ties we could have in the same election.
Here is the probability of each tie according to direct Monte Carlo:
ab_tie_for_first <- function(sims, tol = .01){
row_max <- apply(sims, 1, max)
mean((sims[,1] == row_max | sims[,2] == row_max) & abs(sims[,1] - sims[,2]) < tol/2)/tol
}
dmc_3 <- c(ab_tie_for_first(sims), ab_tie_for_first(sims[,c(1,3,2)]), ab_tie_for_first(sims[,c(2,3,1)]))
dmc_3
## [1] 1.59715 0.65581 0.37128
And here are the three SimplicialCubature
-based results:
sc_result_ac <- SimplicialCubature::adaptIntegrateSimplex(dir_fn, S = cbind(c(.5, 0, .5), c(1/3, 1/3, 1/3)), alpha = alpha)
sc_result_bc <- SimplicialCubature::adaptIntegrateSimplex(dir_fn, S = cbind(c(.0, .5, .5), c(1/3, 1/3, 1/3)), alpha = alpha)
sc_3 <- c(sc_result$integral, sc_result_ac$integral, sc_result_bc$integral)
sc_3
## [1] 3.9035369 1.5831646 0.8886139
sc_3/dmc_3
## [1] 2.444064 2.414060 2.393379
Same comparison but for different parameters on the integrand function:
alpha2 <- c(18, 15, 8)
sc_result_ab <- SimplicialCubature::adaptIntegrateSimplex(dir_fn, S = cbind(c(.5, .5, 0), c(1/3, 1/3, 1/3)), alpha = alpha2)
sc_result_ac <- SimplicialCubature::adaptIntegrateSimplex(dir_fn, S = cbind(c(.5, 0, .5), c(1/3, 1/3, 1/3)), alpha = alpha2)
sc_result_bc <- SimplicialCubature::adaptIntegrateSimplex(dir_fn, S = cbind(c(.0, .5, .5), c(1/3, 1/3, 1/3)), alpha = alpha2)
sc_3a <- c(sc_result_ab$integral, sc_result_ac$integral, sc_result_bc$integral)
sc_3a
## [1] 5.8889768 0.3037420 0.1781902
sims2 <- gtools::rdirichlet(N/10, alpha2)
dmc_3a <- c(ab_tie_for_first(sims2), ab_tie_for_first(sims2[,c(1,3,2)]), ab_tie_for_first(sims2[,c(2,3,1)]))
dmc_3a
## [1] 2.3888 0.1260 0.0729
sc_3a/dmc_3a
## [1] 2.465245 2.410651 2.444311
So the SimplicialCubature
results are consistently around 2.4 times bigger. What explains this? Is there a Jacobian issue somewhere?
Here is a different problem that involves integrating an area rather than integrating along a line: what is the probability that candidate 1 wins more than 50% of the vote?
The simulation answer:
sim_50 <- mean(sims[,1] > .5)
sim_50
## [1] 0.2024958
The SimplicialCubature
answer:
S_50 <- cbind(c(.5, .5, 0), c(.5, 0, .5), c(1,0,0))
sc_50 <- SimplicialCubature::adaptIntegrateSimplex(dir_fn, S = S_50, alpha = alpha)
sc_50$integral
## [1] 0.3506305
The answers are again different, but this time the ratio is \(\sqrt{3}\):
sc_50$integral/sim_50
## [1] 1.731544
As is the ratio for the same quantity for each of the other candidates:
b_50 <- SimplicialCubature::adaptIntegrateSimplex(dir_fn, S = cbind(c(.5, .5, 0), c(0, .5, .5), c(0,1,0)), alpha = alpha)
c_50 <- SimplicialCubature::adaptIntegrateSimplex(dir_fn, S = cbind(c(0, .5, .5), c(.5, 0, .5), c(0,0,1)), alpha = alpha)
sc_50s <- c(sc_50$integral, b_50$integral, c_50$integral)
dmc_50s <- c(mean(sims[,1] > .5), mean(sims[,2] > .5), mean(sims[,3] > .5))
sc_50s/dmc_50s
## [1] 1.731544 1.731836 1.729125
We can actually match the simulation-based result with SimplicialCubature
if we drop a dimension:
S_50_2 <- cbind(c(.5, .5), c(.5, 0), c(1,0))
dir_fn2 <- function(x, alpha){gtools::ddirichlet(c(x, 1 - sum(x)), alpha = alpha)}
sc_50_2 <- SimplicialCubature::adaptIntegrateSimplex(dir_fn2, S = S_50_2, alpha = alpha)
sc_50_2$integral
## [1] 0.2024365
But we can’t match the tie probabilities this way:
sc_ab <- SimplicialCubature::adaptIntegrateSimplex(dir_fn2, S = cbind(c(.5, .5), c(1/3, 1/3)), alpha = alpha)
sc_ac <- SimplicialCubature::adaptIntegrateSimplex(dir_fn2, S = cbind(c(.5, 0), c(1/3, 1/3)), alpha = alpha)
sc_bc <- SimplicialCubature::adaptIntegrateSimplex(dir_fn2, S = cbind(c(.0, .5), c(1/3, 1/3)), alpha = alpha)
sc_2d <- c(sc_ab$integral, sc_ac$integral, sc_bc$integral)
sc_2d
## [1] 2.2537081 1.4452249 0.8111898
The relationship between these tie probabilities and the simulation-based probabilities is not linear:
sc_2d/dmc_3
## [1] 1.411081 2.203725 2.184846
I have assumed that the simulation-based results are the target, so it seems like my use of SimplicialCubature
must be incorrect.
SimplicialCubature
I encountered an issue trying to use gtools::ddirichlet
as my integrand function, and I wondered if John might want to fix something in the SimplicialCubature
code to address it.
To highlight the issue, let’s integrate a function on the unit simplex. This works fine:
S <- cbind(c(1,0,0), c(0,1,0), c(0,0,1)) # same as UnitSimplexV(3)
sum_fn <- function(x){sum(x)}
SimplicialCubature::adaptIntegrateSimplex(sum_fn, S)
## $integral
## [1] 0.8660254
##
## $estAbsError
## [1] 8.660254e-13
##
## $functionEvaluations
## [1] 32
##
## $returnCode
## [1] 0
##
## $message
## [1] "OK"
But we encounter an error when we replace the integrand function with a Dirichlet density:
dir_fn_naive <- function(x, alpha){
gtools::ddirichlet(x = x, alpha = alpha)
}
SimplicialCubature::adaptIntegrateSimplex(dir_fn_naive, S, alpha = c(1,1,1))
## Error in gtools::ddirichlet(x = x, alpha = alpha): Mismatch between dimensions of 'x' and 'alpha'.
gtools::ddirichlet()
seems to think it’s being given an x
argument that is not of the same length as alpha
(3).
The problem seems to be that x
is a matrix, and gtools::ddirichlet()
is expecting a vector. If we change the format explicitly, no error:
dir_fn_fixed <- function(x, alpha){
gtools::ddirichlet(x = as.vector(x), alpha = alpha)
}
SimplicialCubature::adaptIntegrateSimplex(dir_fn_fixed, S, alpha = c(1,1,1))
## $integral
## [1] 1.732051
##
## $estAbsError
## [1] 1.732051e-12
##
## $functionEvaluations
## [1] 32
##
## $returnCode
## [1] 0
##
## $message
## [1] "OK"
We could consider altering John alter the code (presumably adsimp()
and integrate.vector.fn()
) to work with integrand functions that expect x
to be a vector?
S
matrix: John’s code and my code (my notes, really)John kindly sent me some code for generating the S
matrix given some vertices. Here I load his code and run an example he sent along:
source('R/TesselationsOfPointCloud.R')
library(mvmesh)
## Loading required package: rcdd
## If you want correct answers, use rational arithmetic.
## See the Warnings sections added to help pages for
## functions that do computational geometry.
## Loading required package: geometry
## Loading required package: abind
## Loading required package: SimplicialCubature
V <- cbind( c(1,0,1e-20), c(0,1,0), c(0,0,1), c(.25,.5,.25), c(.1,.4,.5) )
V
## [,1] [,2] [,3] [,4] [,5]
## [1,] 1e+00 0 0 0.25 0.1
## [2,] 0e+00 1 0 0.50 0.4
## [3,] 1e-20 0 1 0.25 0.5
S <- simplicesFromVertices( V )
S
## , , 1
##
## [,1] [,2] [,3]
## [1,] 0.25 1e+00 0
## [2,] 0.50 0e+00 1
## [3,] 0.25 1e-20 0
##
## , , 2
##
## [,1] [,2] [,3]
## [1,] 0.25 0 1e+00
## [2,] 0.50 0 0e+00
## [3,] 0.25 1 1e-20
##
## , , 3
##
## [,1] [,2] [,3]
## [1,] 0.1 0 0
## [2,] 0.4 0 1
## [3,] 0.5 1 0
##
## , , 4
##
## [,1] [,2] [,3]
## [1,] 0.1 0.25 0
## [2,] 0.4 0.50 1
## [3,] 0.5 0.25 0
##
## , , 5
##
## [,1] [,2] [,3]
## [1,] 0.1 0.25 0
## [2,] 0.4 0.50 0
## [3,] 0.5 0.25 1
open3d()
## null
## 1
points3d( t(V), size=10,col="red" )
for (i in 1:dim(S)[3]) {
wrap <- cbind( S[,,i], S[,1,i] )
lines3d(t(wrap), width=3, col='blue')
}
You must enable Javascript to view this page properly.
This code is useful for taking a set of vertices and tesselating it, i.e. rendering a bunch of simplices in the form of an S
matrix that we can then pass to the adaptIntegrateSimplex
function.
The approach I wrote up is based on determining the convex hull of a set of vertices and then extracting the facets relevant to my problem. The geometry::convhulln()
function represents the convex hull as triangulated/tesselated combinations of vertices, so my function takes the whole set of vertices for a candidate’s “win region”, get the triangulated/tesselated convex hull, and identifies those facets where a certain condition binds (e.g. \(a\) tying \(b\) in a particular way) at all vertices; these are the facets where I need to integrate my function.
In short, my code is more specific to my problem, but like John’s code it involves dropping a dimension (projecting down to \(\mathbb{R}^{k-1}\)) and tesselating.