add conditional_bernoulli distribution #27

goldingn · 2017-11-02T08:48:22Z

Add a distribution node type for this compound distribution, where the observed variable y can only take value 1 if a latent bernoulli variable z takes value 1, else it must take 0, i.e.:

y_i ~ bernoulli(z * p_i)
z ~ bernoulli(\psi)

p_i = p(y_i = 1 | z = 1)
\psi = p(z = 1)

Inference on p and psi is tractable when there are multiple bernoulli trials in each observation (where p take different values in each trial). The density of this compound distribution (i.e. the likelihood for y) can be calculated directly, explicitly integrating over the latent variable z.

This formulation underpins the ecological imperfect-detection model of MacKenzie et al. which gives the formulation of the likelihood/density.

For each observation in the imperfect detection model, y and p are vectors giving indicating whether a species was detected at each visit, and the probability of detection (which may vary between visits), and z and psi are scalars indicating whether the species was present (assumed the same at all visits) and the probability of being present.

The syntax would be:

distribution(y) <- conditional_bernoulli(p, psi)

(since the distribution is discrete, y can't be a variable), where y and p are vectors (or matrices) and psi is a scalar (or vector).

The text was updated successfully, but these errors were encountered:

goldingn · 2017-11-02T08:50:16Z

there are extensions to this model, for example introducing a parameter q_i = p(y_i = 1 | z_i = 0), corresponding to erroneous detections in the imperfect detection case. These could be added via additional, optional parameters

jdyen · 2018-10-05T00:08:27Z

could this be setup to take more flexible inputs, e.g., for a model with constant p or constant psi across multiple individuals?
z_i = bernoulli(\psi) or z_i = bernoulli(\psi_i)
with
y_ij = bernoulli(\psi * p) or y_ij = bernoulli(\psi * p_i) or y_ij = bernoulli(\psi * p_j)

So \psi could have dims (1 x 1) or (n x 1), and p could have dims (1 x 1), (n x 1), (1 x k), or (n x k), assuming y has dims (n x k).

My current way of handling this is:

# with a common psi for all individuals
psi <- beta(1, 1, dim = 1)
psi_expanded <- do.call('c', lapply(1:n, function(x) psi))

and

# with a single p for each individual
p <- beta(1, 1, dim = c(n, 1))
p_expanded <- do.call('cbind', lapply(1:k, function(x) p))

hrlai · 2021-03-28T04:30:42Z

Dear @goldingn and @jdyen , thank you pioneering these models in greta. I am exploring how to incorporate imperfect detection into abundance joint species distribution modelling after reading Tobler et al. (2019), which then led me to Yamaura et al. (2012).

However, I ran into an error while trying to implement Yamaura et al. (2012)'s model in greta:

Error: model contains a discrete random variable that doesn't have a fixed value, so cannot be sampled from

I think it is related to this thread, because I cannot use poisson for unobserved variable? This is also hinted in the helpfile. I hope to get some thoughts on whether I'm coding it incorrectly or is it something not yet implemented in greta?

Below is two code chunks, one to simulate abundance data and another to fit model (apologies if they look ugly!) :

Simulate data

# Simulate data to test fit_model.R script

# Following the data generative process in 
# Yamaura et al. (2012) Biodiversity and Conservation

set.seed(123)

# number of species
n_spp  <- 10   
# number of sites  
n_site <- 100 
# number of sampling events
t <- 10

# probability of detection for each species
# assuming the same p_j across sites for now
alpha0_true <- rnorm(n_spp, 2, 1)
mu_true     <- alpha0_true
sigma_true  <- rexp(n_spp, 2)
p_true      <- numeric(n_spp)
for (j in seq_len(n_spp)) {
    logit_p_true <- rnorm(1, mu_true, sigma_true)
    p_true[j] <- exp(logit_p_true) / (exp(logit_p_true) + 1)
}

# true abundance
# using Poisson for now (explore neg. binom. later)
beta0_true  <- rnorm(n_spp, 2, 1)
lambda_true <- exp(rep(1, n_site) %*% t(beta0_true))
N_true <- matrix(rpois(n_site * n_spp, lambda_true), n_site, n_spp)
    
# observed 
n_true <- array(NA, dim = c(n_site, n_spp, t))
for (i in seq_len(n_site)) {
    for (j in seq_len(n_spp)) {
        n_true[i, j, ] <- rbinom(t, N_true[i, j], p_true[j])
    }
}

Fit model

library(greta)

# source("code/sim_data.R")

# data
n <- as_data(n_true)

# variables
X <- ones(n_site)
V <- ones(n_site)

# priors
sd_beta0 <- exponential(2)
beta0    <- normal(0, sd_beta0, dim = n_spp)
lambda   <- exp(X %*% t(beta0))
N        <- poisson(lambda)    # I think this doesn't work as per helpfile
N <- do.call(abind, c(replicate(t, N, simplify = FALSE), along = 3))

sd_alpha0   <- exponential(2)
alpha0      <- normal(0, sd_alpha0, dim = n_spp)
mu          <- V %*% t(alpha0)
sigma       <- exponential(2, dim = n_spp)
sigma <- do.call(rbind, replicate(n_site, t(sigma), simplify = FALSE))
# p           <- ilogit(multivariate_normal(mu, sigma))
p           <- ilogit(normal(mu, sigma))
p <- do.call(abind, c(replicate(t, p, simplify = FALSE), along = 3))

# observational model
distribution(n) <- binomial(N, p)
mod <- model(N, p)

jdyen · 2021-03-28T22:54:43Z

I don't think this has been added to greta. A post on the greta forum might get some good answers (maybe do a quick search first, someone might have already posted a workaround).

hrlai · 2021-03-29T21:11:57Z

Thanks for the heads up. Sure I'll move over to the forum and apologies if here isn't the best place to ask :)

njtierney transferred this issue from greta-dev/greta.multivariate Dec 11, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

add conditional_bernoulli distribution #27

add conditional_bernoulli distribution #27

goldingn commented Nov 2, 2017

goldingn commented Nov 2, 2017

jdyen commented Oct 5, 2018

hrlai commented Mar 28, 2021

jdyen commented Mar 28, 2021

hrlai commented Mar 29, 2021 •

edited

Loading

add conditional_bernoulli distribution #27

add conditional_bernoulli distribution #27

Comments

goldingn commented Nov 2, 2017

goldingn commented Nov 2, 2017

jdyen commented Oct 5, 2018

hrlai commented Mar 28, 2021

Simulate data

Fit model

jdyen commented Mar 28, 2021

hrlai commented Mar 29, 2021 • edited Loading

hrlai commented Mar 29, 2021 •

edited

Loading