Sensitivity to Priors in Bayesian Inference #437
One aspect that could be interesting is the interaction between prior width and location, in other words, between the effect expectation and its certainty, which could be captured by a simulation that modulates both of these aspects independently. Maybe (I don't have any more precise hypotheses here yet, though) some indices would be more sensitive to location and others to precision?
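One possible shape for such a simulation grid, sketched here with rstanarm (editor's sketch: the grid values, sample size, and model are illustrative placeholders, not a design proposed in the thread):

```r
library(rstanarm)

# Cross prior location (effect expectation) with prior scale (certainty),
# independently, as suggested above. All values are placeholders.
design <- expand.grid(
  prior_location = c(0, 0.25, 0.5),
  prior_scale    = c(0.05, 0.5, 5),
  true_effect    = c(0, 0.3)
)

fit_cell <- function(location, scale, effect, n = 50) {
  d <- data.frame(x = rnorm(n))
  d$y <- effect * d$x + rnorm(n)
  # Same normal prior on the slope in every cell, varying location and scale
  stan_glm(y ~ x, data = d, prior = normal(location, scale), refresh = 0)
}

fits <- mapply(fit_cell,
               design$prior_location, design$prior_scale, design$true_effect,
               SIMPLIFY = FALSE)
```

Each fitted cell could then be passed to `bayestestR::describe_posterior()` to see which indices track location and which track precision.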
Sad to hear that, seems like you would be a great fit for it, but I am sure you have better plans in mind :) Let's discuss it upon Daniel's return, which happens in SW Episode 6 (aka in a few days, I think)
If by effect certainty you mean "how wide the prior is" or "how informative the data is", then I think this falls under my first hypothesis:
Thanks, Dom! I do love research, but I equally hate all the academia fluff - reminds me of one of my favorite tweets:
I'll head out to the real world and start the credibility crisis in industry (oh boy...) 😋
Many relevant thoughts here! We see the flat-priors-support-the-null effect very strongly for BIC-based Bayes factors, which are close to the most uninformative thing you can get (Wagenmakers, 2007).

I have recently begun thinking about priors as the model itself. For example, a model with 10 parameters with wide priors can easily be more complex than a model with 12 parameters with narrow priors, if the prior predictive space is larger in the former. At the more extreme end, the very inclusion of a parameter in the model is a 100% prior that it exists, and excluding one is a 100% prior that it doesn't (as seen from the model). So the issue of how the prior affects the posterior, and the relation between the two, is really the same issue as choosing which parameters to include in the model. I am not good at reading up on the literature, so this may have been discussed extensively already without me noticing. But at least, it is my impression that this has not penetrated public discourse about models/priors yet, and I think it makes priors less terrifying (by making model building more terrifying, I guess...).

As a side note, I am finalizing a package where you can fix parameters to specific values (or other parameters) via the prior: https://lindeloev.github.io/mcp/articles/priors.html. Hoping to make it public today or tomorrow.
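A toy sketch of the prior-predictive-space point above (editor's sketch in plain base R; the dimensions and prior widths are made-up illustrations):

```r
set.seed(437)
n <- 100
X_few  <- matrix(rnorm(n * 3), n, 3)  # few parameters...
X_many <- matrix(rnorm(n * 5), n, 5)  # ...vs. more parameters

# Draw coefficients from the prior and push them through the linear model,
# giving the prior predictive distribution of the fitted values.
prior_pred <- function(X, prior_sd, draws = 2000) {
  betas <- matrix(rnorm(ncol(X) * draws, 0, prior_sd), ncol(X), draws)
  X %*% betas
}

spread_few  <- sd(prior_pred(X_few,  prior_sd = 10))   # wide priors
spread_many <- sd(prior_pred(X_many, prior_sd = 0.5))  # narrow priors
c(spread_few, spread_many)  # the smaller model spans the larger space
```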
Sounds very interesting, and it would fit into my current plans for a more methodology-based habilitation :-)
Indeed sad, but of course you know you can't leave the easystats project, no matter where you are! :-)
Since I'm now starting to work on a prior-tutorial paper, maybe @DominiqueMakowski can take the lead? Not sure about the time frame, though, and if funding for open-access publishing might be an issue, maybe we'll find a solution (I don't see many problems here right now).
Hi @lindeloev! Really interesting stuff! Here are my thoughts on your thoughts: The BIC-based BF's approximate prior is far from flat - if it were, all resulting BFs would be highly supportive of the null. In Wagenmakers' (2007) Appendix B he describes the assumed prior, and though it is wide, it is not too wide (he calls it a reasonable "noninformative" prior, which made me laugh).
I generally agree that the topic of "how do priors affect posteriors" should always include point priors (be they "null" points at 0, or fixed points as in your
Should have known not to sign that piece of paper...👿
Oh, I should have read that appendix (and https://pdfs.semanticscholar.org/36ee/f823310b020648d1b254ca1e35e3362655d1.pdf, which discusses the implications of prior width on BFs) in more detail. Now I'm feeling like an imposter :-) It has just been very wide for the analyses I've been doing. I still need to find the formula for computing a BIC prior - though computationally it may just be like taking a flat prior and updating it with one data point on a normal likelihood? Yeah, discussion of point priors may be beside the point in a paper like this (though I just find it really intriguing to think of a model as having point-null priors for all parameters in the universe except those "included"). I guess I was just replying to
saying that the subjectivity enters already in deciding which parameters to include in our model. Just to take some of that criticism/worry off the prior distributions. Though I worry that I'm being way too philosophical now :-)
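For reference, the BIC approximation to the BF from Wagenmakers (2007) is straightforward to compute in base R (editor's sketch; the data are simulated purely for illustration):

```r
# BF01 ≈ exp((BIC_1 - BIC_0) / 2). The implicit prior is the
# unit-information prior: roughly a flat prior updated by the amount of
# information in a single observation, much as guessed above.
set.seed(1)
d <- data.frame(x = rnorm(100))
d$y <- 0.3 * d$x + rnorm(100)

m0 <- lm(y ~ 1, data = d)  # null model
m1 <- lm(y ~ x, data = d)  # alternative model

bf01 <- exp((BIC(m1) - BIC(m0)) / 2)  # evidence for the null
bf10 <- 1 / bf01                      # evidence for the alternative
```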
Interesting reference - added to my "to read" list - thanks!
That sounds super interesting... I wonder if that will be equivalent - keep us (me!) posted!
Definitely need to address this in the paper.
There's a paragraph on exactly this topic (however, advocating for careful use of BFs) here:
Since I feel in the Force that @mattansb has some great ideas about that paper, I feel he should still be first author, without considering his eventual abandonment of academia (and you never know what the future has in store :) (plus, I want to see the Ben-Shachar inverse paradox becoming real 😋). Daniel and I can share the last-author spot because why not 😁
Although in a perfect world I'd prefer open-access publishing, publishing takes priority over not publishing, so if we don't have funding we should still submit to a non-OA journal. And nowadays, with preprints...

NOW LET'S START THINKING about the important questions:

- **What do we compare?**
- **Of what?** Logistic and linear models, similar to paper 1?
- **On what?**
Once we have a small running script, I can run it on the server.
Unlike you guys, I am not a full

As I started saying above, I think this paper can be lighter on the simulations than the previous one. That is, we can explain with math the implications of the location and scale of priors for each of the measures, and then have a figure, derived from simulations, to drive the point home. (Like, the simulations aren't so much the leading result as a visual + semi-empirical aid.) The effects of interest are:
As we can essentially write this whole paper up with maths / Bayesian logic alone, before a single simulation is run (it could in fact be a semi-introduction to Bayesian concepts), we might want to start with that?
Yes, you are right. How about:
Should this conversation move to
Jeff Rouder and co recently posted a preprint comparing BFs to information criteria like the WAIC, where they discuss sensitivity to priors somewhat. I disagree with the interpretation of their results somewhat, but it may be useful to refer to or cite. https://psyarxiv.com/e6g9d/
@strengejacke @DominiqueMakowski Let's open this up here officially.
I think this paper can be quite short, really. Here I am pre-registering my hypotheses:
For posterior-based indices (which is all but the BFs), the degree to which they are affected by priors looks like this:
That is, weak-to-flat priors have close to no effect on the indices, but as the priors grow narrower, the indices become reflections of the prior rather than of the data (or of a mix of the prior and the data). This is the reverse Jeffreys-Lindley-Bartlett paradox.
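A closed-form sketch of this hypothesis using a conjugate normal-normal model with known sigma (editor's sketch; all numbers are assumptions for illustration):

```r
# Posterior for a normal mean with a normal prior and known sigma = 1.
xbar <- 0.4; n <- 50; sigma <- 1    # assumed data summary
mu0  <- 0                           # prior centered on the null
tau0 <- c(100, 10, 1, 0.1, 0.01)    # prior SDs, from near-flat to narrow

post_prec <- 1 / tau0^2 + n / sigma^2
post_mean <- (mu0 / tau0^2 + n * xbar / sigma^2) / post_prec
post_sd   <- sqrt(1 / post_prec)

# Probability of direction (pd) as an example posterior-based index:
pd <- pnorm(0, post_mean, post_sd, lower.tail = FALSE)
round(data.frame(tau0, post_mean, pd), 3)
# Wide priors leave pd driven by the data; a narrow prior at 0 drags
# post_mean to ~0 and pd to ~0.5, i.e. the index reflects the prior.
```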
For BFs, we will get:
We can then talk about why it is dangerous to treat the posterior-based indices as "objective", and why BFs should only be used when researchers have informative (weak or strong) priors they want to test. That is, posterior indices are descriptive of the posterior (= prior + data, not the data alone), while BFs are not descriptive of the data / posterior; they are indicative of the match between prior and data.
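And the BF side of the same toy model, via the Savage-Dickey density ratio (editor's sketch again; numbers are illustrative and match the sketch above):

```r
# Same conjugate normal-normal setup as the sketch above.
# Savage-Dickey: BF01 = posterior density at 0 / prior density at 0.
xbar <- 0.4; n <- 50; sigma <- 1; mu0 <- 0
tau0 <- c(100, 10, 1, 0.1, 0.01)
post_prec <- 1 / tau0^2 + n / sigma^2
post_mean <- (mu0 / tau0^2 + n * xbar / sigma^2) / post_prec
post_sd   <- sqrt(1 / post_prec)

bf01 <- dnorm(0, post_mean, post_sd) / dnorm(0, mu0, tau0)
round(data.frame(tau0, bf01, bf10 = 1 / bf01), 3)
# Very wide priors inflate BF01 (the Jeffreys-Lindley effect), and very
# narrow priors at 0 make H0 and H1 indistinguishable: the BF indexes the
# prior-data match, not the data alone.
```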
Where we might run into trouble is with reviewers asking that we suggest priors - so we should emphasize that our focus is not on correct prior selection, but on correct inference using Bayesian indices.
As for who should take the helm on this paper... My plan is to leave academia post-PhD (but of course I will be an R developer 4EVA!), so first-authored papers will serve you two better than they will serve me. So this is up to you guys 😁