0.5.3

@usaito usaito released this 03 Apr 23:08
· 41 commits to master since this release
c25bd6f

Updates

  • Implement a synthetic data generator class for OPE with action embeddings (obp.dataset.SyntheticBanditDatasetWithActionEmbeds) and an estimator that leverages the action embeddings (obp.ope.MarginalizedInverseProbabilityWeighting) (#155)
  • Implement several OPE estimators for the multiple logger setting (#154)
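
To make the two additions above concrete, here is a minimal NumPy sketch of the ideas behind them. It does not use obp's API: the function names `marginalized_ipw` and `balanced_ipw`, their arguments, and the array layouts are hypothetical and chosen for illustration only. The sketch assumes a single discrete embedding dimension with a known distribution p(e|a), and, for the multiple logger case, that each logging policy's action probabilities are known for every logged context.

```python
import numpy as np


def marginalized_ipw(reward, embed, pi_b, pi_e, p_e_a):
    """Marginalized IPW sketch (hypothetical helper, not obp's API).

    reward: (n,) observed rewards
    embed:  (n,) observed embedding category of each logged action
    pi_b:   (n, n_actions) logging policy probabilities
    pi_e:   (n, n_actions) evaluation policy probabilities
    p_e_a:  (n_actions, n_categories) assumed-known p(e | a)
    """
    # Marginal embedding distributions: p(e | x, pi) = sum_a pi(a | x) p(e | a)
    p_e_pi_b = pi_b @ p_e_a
    p_e_pi_e = pi_e @ p_e_a
    idx = np.arange(reward.shape[0])
    # Importance weights are defined over embeddings instead of actions
    w = p_e_pi_e[idx, embed] / p_e_pi_b[idx, embed]
    return float(np.mean(w * reward))


def balanced_ipw(reward, action, stratum, pi_bs, pi_e):
    """Balanced IPW sketch for multiple loggers (hypothetical helper).

    reward:  (n,) observed rewards
    action:  (n,) logged actions
    stratum: (n,) index of the logging policy that produced each sample
    pi_bs:   (n_loggers, n, n_actions) probabilities under each logger
    pi_e:    (n, n_actions) evaluation policy probabilities
    """
    n_loggers = pi_bs.shape[0]
    # Share of the data contributed by each logger
    rho = np.bincount(stratum, minlength=n_loggers) / stratum.shape[0]
    # Mixture logging policy: pi_avg(a | x) = sum_k rho_k * pi_k(a | x)
    pi_avg = np.tensordot(rho, pi_bs, axes=1)
    idx = np.arange(reward.shape[0])
    w = pi_e[idx, action] / pi_avg[idx, action]
    return float(np.mean(w * reward))
```

In this sketch, passing an identity matrix as `p_e_a` (each action is its own embedding category) makes the marginalized weights coincide with the standard action-level importance weights, which is a convenient sanity check.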

References

  • Yuta Saito and Thorsten Joachims. "Off-Policy Evaluation for Large Action Spaces via Embeddings." arXiv, 2022.
  • Aman Agarwal, Soumya Basu, Tobias Schnabel, and Thorsten Joachims. "Effective Evaluation using Logged Bandit Feedback from Multiple Loggers." KDD, 2018.
  • Nathan Kallus, Yuta Saito, and Masatoshi Uehara. "Optimal Off-Policy Evaluation from Multiple Logging Policies." ICML, 2021.