
Errata and suggestions: Model to Meaning website #1304

Open · DrJerryTAO opened this issue Dec 15, 2024 · 3 comments

DrJerryTAO commented Dec 15, 2024

I have found some errors and areas for improvement on the website https://marginaleffects.com/.

  1. The navigation panel repeats "9 Categorical and ordinal outcomes".
  2. At https://marginaleffects.com/chapters/experiments.html#factorial-experiments.
    "In plant phyisiology, they could be used to ?? how combinations of temperature and humidity affect photosynthesis."
  3. https://marginaleffects.com/chapters/experiments.html#conjoint-experiments
    "To analyze this dataset, we estimate a linear regression model with choice as the outcome, and in which all predictors are interacted:" Should it not use multinomial regression?
  4. https://marginaleffects.com/chapters/experiments.html#marginal-means
    "Compute the predicted (i.e., fitted) values for each row in the original dataset. Marginalize (average) those predictions with respect to the variable of interest." This description seems to create average predictions in the empirical grid, not marginal means.
    "To see if the average probability of selection is higher when a candidate is fluent in English, relative to when they require an interpreter, we use the hypothesis argument." The script hypothesis = "b1 = b3" needs updating due to version change.
  5. https://marginaleffects.com/chapters/experiments.html#average-marginal-component-effects
    "?sec-gcomputation" is not linked and defined.
  6. https://marginaleffects.com/chapters/experiments.html#average-feature-choice-probability
    "AMCE incorporates comparisons and averages over both direct and indirect attribute comparisons." What are direct and indirect comparisons? Better to supply a definition and give an example.
    "These data are structured in what could be called a “long” format...we convert long to wide data using the [reshape()] function from base R." The data are still in the long format. No reshape was needed.
    "Moreover, since the data is in “long” format, with one profile per row, we must also allow each variable to have different coefficients based on profile number: the effect of language on the probability of selection is obviously different for profile=1 and for profile=2. Indeed, when profile=1 the language column records the profile’s own language skills. When profile=2, the same column records the alternative profile’s language skills." This is not true. No profile predictor was used in any model. It should be used if the position or sequence of presenting the profiles matter when respondents tend to select the first or last option.
    mod <- lm(choice ~ language * language.alt + job * job.alt, data = dat)
    I think the proper model is simply a standard logistic regression in which each person-task has one row and the chosen profile is the binary response (see the second sketch after this list). Even in a linear probability model, the second row in each task offers no additional information if language * language.alt + job * job.alt are used as predictors.
    "Since we are not interested comparison pairs where both profiles have the same language skills, we use the subset to supply an appropriate grid." The description does not match the following script. The script should use newdata = subset(language.alt != language) instead.
    "Is the AFCP for “used intepreter vs. fluent” different from the AFCP for “broken vs. fluent”?" The question does not match the following script.
vincentarelbundock (Owner) commented

Thanks so much for doing such a close read, and for reporting these issues. I really appreciate your time!

I'll fix those issues as soon as I find a few minutes.

DrJerryTAO (Author) commented

In 12 Mixed effects regression and post stratification https://marginaleffects.com/chapters/mrp.html

  1. "Compute a weighted average of these predicted probabilities [is calculated] using population weights from the poststratification frame to produce the final MRP estimates."
  2. https://marginaleffects.com/chapters/mrp.html#posterior-summaries "By default, marginaleffects functions report the mean of posterior draws, along with compute the mean of the posterior distribution of draws, along with equal-tailed intervals."
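
For reference, a minimal sketch of the poststratification step that the corrected sentence in item 1 describes; the object and column names (mod, poststrat, population, state) are my assumptions, not the book's:

library(marginaleffects)

# `mod` is the fitted mixed-effects model; `poststrat` is the
# poststratification frame, one row per demographic cell, with a
# `population` column of cell counts (hypothetical names).
predictions(
  mod,
  newdata = poststrat,
  by = "state",       # final MRP estimate per state
  wts = "population"  # weight each cell by its population count
)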

In 18 Alternative Software https://marginaleffects.com/bonus/alternative_software.html#fmeffects

  1. The arguments of fme() have been updated: https://holgstr.github.io/fmeffects/reference/fme.html. The older script on the website throws an error: Error in fmeffects::fme(model = forest, data = bikes, target = "count", : unused arguments (target = "count", step.size = 1). (The demonstration below already uses the updated features argument.)
  2. avg_comparisons() for the lrn() model has some strange behavior. It appears that avg_comparisons() extracts the first observation of predicted_lo, predicted_hi, and predicted from comparisons(), not their averages. These three columns, however, are not mentioned in the help file. Also, for all observations in this particular model, predicted == predicted_lo; in some other model applications, I have seen predicted == predicted_hi. See the demonstration below.
library(marginaleffects)
library(mlr3verse)
library(fmeffects)

data("bikes", package = "fmeffects")
task <- as_task_regr(x = bikes, id = "bikes", target = "count")
forest <- lrn("regr.ranger")$train(task)

avg_comparisons(forest, variables = list(temp = 1), newdata = bikes) |>
  data.frame()
#>   term contrast estimate predicted_lo predicted_hi predicted
#> 1 temp       +1 57.41196     1443.132     1606.049  1443.132

comparisons <- comparisons(
  forest, variables = list(temp = 1), newdata = bikes)
comparisons[1, ] |> data.frame()
#>   rowid term contrast estimate predicted_lo predicted_hi predicted ...
#> 1     1 temp       +1 162.9178     1443.132     1606.049  1443.132 ...

# It appears that avg_comparisons() extracts the first observation for
# predicted_lo, predicted_hi, and predicted, not their averages.
with(comparisons, mean(predicted_hi - predicted_lo))
#> 57.41196  (matches avg_comparisons())
with(comparisons, all(predicted == predicted_lo))
#> TRUE

effects <- fme(
  model = forest, data = bikes,
  features = list("temp" = 1), ep.method = "envelope")
summary(effects)
#> Forward Marginal Effects Object
#> Step type: numerical
#> Features & step lengths: temp, 1
#> Extrapolation point detection: envelope, EPs: 3 of 731 obs. (0 %)
#> Average Marginal Effect (AME): 57.6256
plot(effects)
effects$results
#> Key: <obs.id>
#>      obs.id       fme
#>       <int>     <num>
#>   1:      1 162.91781
#>   2:      2 283.90225
#>   3:      3  49.42845
#>   4:      4  98.74693
#>   5:      5 195.97501
#>  ---
#> 724:    727 374.40931
#> 725:    728 230.96809
#> 726:    729 377.85819
#> 727:    730 565.06474
#> 728:    731  28.26367
effects$ame
#> 57.6256

vincentarelbundock (Owner) commented

Thanks for these!

But I don't understand your last point. You can ignore the predicted columns here; what matters is the estimate. The estimate from avg_comparisons() is 57.4, which matches the $ame from fme() (57.6, up to the randomness of the forest fit). And the first estimate from comparisons() is 162.9, which is the same as the first fme value in $results. So that's consistent.

What am I missing?
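
Concretely, reusing the comparisons object from your demonstration: each row-level estimate equals predicted_hi minus predicted_lo, so their mean reproduces the avg_comparisons() estimate.

# `comparisons` was created in the demonstration above.
with(comparisons, mean(estimate))
#> 57.41196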
