Issue generating adjusted predictions with wbm-models #622

Tan2525 · 2024-12-17T02:06:29Z

I'm having trouble generating adjusted predictions with wbm models. My intention is to generate an interaction plot with ggpredict. However, it keeps throwing the following error:

Error in complete.cases(data[[variable]]) : 
  no input has determined the number of cases
In addition: Warning message:
In data.frame(..., check.names = FALSE) :
  row names were found from a short variable and have been discarded

I'm unable to resolve this error. May I know if the ggeffects package is compatible with wbm panel models?

I've included a replication dataset here: dataset__wide.csv.
Below is the code to prepare the dataset and run the model.

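# Load the packages used below.
library(dplyr)     # %>%, mutate(), case_when(), select()
library(stringr)   # str_detect(), str_replace_all()
library(panelr)    # panel_data(), wbm()
library(lme4)      # glmerControl()
library(ggeffects) # ggpredict()

# Read csv.
dataset__wide <- read.csv(file = "dataset__wide.csv")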

# Pivot data to long.
dataset__long <- dataset__wide %>%
  tidyr::pivot_longer(
    # Exclude the time-invariant variables
    !c(
      ID, 
      Control,
    ),
    names_to = "Variables", values_to = "Values"
  ) %>%
  dplyr::mutate(
    
    # Create a variable keeping track of the waves.
    Wave = case_when(
      str_detect(string = Variables, pattern = "t1") ~ 0,
      str_detect(string = Variables, pattern = "t2") ~ 1,
      TRUE ~ NA_real_
    ),

    # Create a variable to standardize the variable names.
    Variable = case_when(
      !(is.na(Variables)) ~ str_replace_all(string = Variables, pattern = "(_+((t1)|(t2)))", replacement = ""),
      TRUE ~ Variables
    ),
    
  ) %>% 
  dplyr::select(
    !Variables
  ) %>%
  tidyr::pivot_wider(names_from = Variable, values_from = "Values", values_fill = NA_real_)

# Create a panel data frame. 
dataset__long__panel <- panel_data(data = dataset__long, id = ID, wave = Wave)

# Fit the panel model
panel_model <- wbm(
  formula = DV ~ IV + M | Control | IV*M, 
  data = dataset__long__panel,
  family = binomial(link = "logit"),
  use.wave = TRUE,
  wave.factor = TRUE,
  weights = Weights,
  scale = TRUE,
  model = "between",
  control = glmerControl(optimizer = "bobyqa")
)

# Compute adjusted predictions
ggpredict(panel_model, terms = c("IV", "M"), bias_correction = TRUE) %>% plot()


strengejacke · 2024-12-17T13:11:49Z

This could be an issue in panelr's predict() method. Maybe @jacob-long can help find out whether this problem is related to predict() or to the ggeffects package?

FWIW, ggemmeans() works.

dataset__wide <- read.csv(file = "~/../Downloads/dataset__wide.csv")
library(panelr)
# Pivot data to long.
dataset__long <- dataset__wide |>
  tidyr::pivot_longer(
    # Exclude the time-invariant variables
    !c(
      ID, 
      Control,
    ),
    names_to = "Variables", values_to = "Values"
  ) |>
  dplyr::mutate(
    
    # Create a variable keeping track of the waves.
    Wave = dplyr::case_when(
      stringr::str_detect(string = Variables, pattern = "t1") ~ 0,
      stringr::str_detect(string = Variables, pattern = "t2") ~ 1,
      TRUE ~ NA_real_
    ),

    # Create a variable to standardize the variable names.
    Variable = dplyr::case_when(
      !(is.na(Variables)) ~ stringr::str_replace_all(string = Variables, pattern = "(_+((t1)|(t2)))", replacement = ""),
      TRUE ~ Variables
    ),
    
  ) |> 
  dplyr::select(
    !Variables
  ) |>
  tidyr::pivot_wider(names_from = Variable, values_from = "Values", values_fill = NA_real_)

# Create a panel data frame. 
dataset__long__panel <- panel_data(data = dataset__long, id = ID, wave = Wave)

# Fit the panel model
panel_model <- wbm(
  formula = DV ~ IV + M | Control | IV*M, 
  data = dataset__long__panel,
  family = binomial(link = "logit"),
  use.wave = TRUE,
  wave.factor = TRUE,
  weights = Weights,
  scale = TRUE,
  model = "between",
  control = glmerControl(optimizer = "bobyqa")
)

d <- expand.grid(lapply(dataset__long__panel[c("IV", "M")], unique))
predict(panel_model, newdata = d)
#> Error in complete.cases(data[[variable]]): no input has determined the number of cases

d <- ggeffects::data_grid(panel_model, c("IV", "M"))
predict(panel_model, newdata = d)
#> Unordered factor wave variable was converted to ordered. You should check
#> that the order is correct.
#> Error in complete.cases(data[[variable]]): no input has determined the number of cases

Created on 2024-12-17 with reprex v2.1.1
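As a side note on the error message itself, here is a minimal base-R sketch of one way it can arise. It assumes (this is not confirmed anywhere in the thread) that the column looked up via data[[variable]] is simply absent from newdata, so that complete.cases() receives NULL. The toy grid and the choice of "Wave" as the missing column are hypothetical and only serve as illustration.

```r
# Hypothetical toy grid with only the two focal terms, as in the calls above.
d <- expand.grid(IV = c(0, 1), M = c(0, 1))

# "Wave" is an assumed example of a column that the grid does not contain.
# Extracting a missing column with [[ ]] returns NULL instead of failing.
missing_col <- d[["Wave"]]
is.null(missing_col)
#> [1] TRUE

# complete.cases() cannot infer a number of cases from NULL and raises an
# error; capture its message instead of stopping the script.
msg <- tryCatch(
  complete.cases(missing_col),
  error = function(e) conditionMessage(e)
)
msg
```

If this is what happens inside predict(), comparing names(d) with the variables the model expects (ID, wave and weights included) might narrow down which column is missing.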

Tan2525 · 2024-12-18T07:37:55Z

Thanks @strengejacke for the suggestion to use ggemmeans(). It successfully generated the adjusted predictions, and using these I was able to generate an interaction plot.

# Compute estimated predictions, with margin = "marginalmeans".
int_dat <- predict_response(
  model = panel_model,
  terms = c("IV", "M"),
  margin = "marginalmeans"
)

# Generate the interaction plot. 
ggplot(data = int_dat) + 
  geom_line(aes(x = x, y = predicted, colour = group))

You can consider my initial issue (of generating adjusted predictions) solved. However, to shed more light on the earlier problem: I was interested in the methodology applied in this paper: https://doi.org/10.1177/1940161224129270. The author(s) have graciously provided the replication dataset (.Rdata; https://drive.google.com/file/d/1a-OPCFA3N0ZC8MNvtA2yp-AzVyXCrV75/view?usp=sharing) and the code to generate the interaction plot online. I have extracted the relevant portions of the code below. I find it odd that ggpredict ran successfully for their model and dataset but failed in my case, even though the model type is the same (wbm).

####Code for models of trust as a moderator####
library(panelr)
library(ggeffects)
library(ggplot2)

# Load the replication dataset. 
load(file = "replication.RData")

####main models reported in the paper for H3####
mtrust1 <- wbm(aff_pol_PT3 ~ partisan_right2_log + partisan_left2_log + no_partisan2_log + POL_INTEREST| 
                 partisan_right2_log*TRUST_GENERAL + partisan_left2_log*TRUST_GENERAL + no_partisan2_log*TRUST_GENERAL + 
                 GENDER + EDU + AGE + christians + race_rc, use.wave = TRUE, wave.factor = TRUE, data=replication)

ggpredict(mtrust1, terms = c("partisan_right2_log", "TRUST_GENERAL[1,3,5]")) %>% plot() + ylim(-4, 4) + 
  labs(y = "Social polarization", 
       x = "Frequency of use of right-leaning news sources",
       title = "Impact of trust and right-leaning news consumption on social polarization", 
       color = "Trust in news")  +
  scale_fill_manual(values = c("#F8766D", "#619CFF", "#00BA38")) +
  scale_color_manual(
    values = c("#F8766D", "#619CFF", "#00BA38"),
    labels = c("Do not trust at all", "Neither", "Trust completely"))

Anyways, just wanted to share that this package is really valuable to academic research and I would like to express my sincerest thanks to you (and the other maintainers) for all your efforts on this amazing package!
