Low-rank DPP Learning and Prediction

Julia implementation of low-rank determinantal point process (DPP) learning and prediction algorithms. Two learning algorithms are provided: the first is an optimization-based algorithm that uses stochastic gradient ascent (SGA), and the second is a Bayesian algorithm that uses stochastic gradient Hamiltonian Monte Carlo (SGHMC).

For details on low-rank DPPs, including SGA-based learning and prediction, see the Low-Rank Factorization of Determinantal Point Processes paper (slides). For more on Bayesian low-rank DPPs, see the Bayesian Low-Rank Determinantal Point Processes paper (slides).
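As a rough illustration of what "low-rank" means here, the sketch below assumes the standard L-ensemble form of a DPP, where the probability of a subset Y is det(L_Y) / det(L + I), with the kernel factored as L = V * V' for an M x K item-factor matrix V. The function name `lowrank_dpp_logprob` and the toy data are illustrative assumptions, not part of this package's API.

```julia
using LinearAlgebra
using Random

# Log-probability of observing the subset `items` under a DPP with kernel
# L = V * V', where V is M x K (M items, K latent factors).
# Hypothetical helper for illustration only; not a function of LowRankDPP.jl.
function lowrank_dpp_logprob(V::AbstractMatrix{<:Real}, items::AbstractVector{<:Integer})
    VY = V[items, :]                      # factor rows for the observed items
    # Numerator: det(L_Y), with L_Y = VY * VY' the |Y| x |Y| submatrix of L.
    log_num = logdet(VY * VY')
    # Normalizer: det(L + I_M) equals det(I_K + V' * V), a K x K determinant.
    log_norm = logdet(I + V' * V)
    return log_num - log_norm
end

Random.seed!(0)
V = randn(6, 3)                           # toy model: 6 items, rank 3
println(lowrank_dpp_logprob(V, [1, 4]))   # log-probability of the basket {1, 4}
```

Because the normalizer only needs a K x K determinant, its cost depends on the number of factors rather than the catalog size. Note that any subset with more than K items has determinant zero under this factorization, so the sketch is only meaningful for baskets of at most K items.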

Installation

Within Julia, use the package manager:

using Pkg
Pkg.add(PackageSpec(url="git://github.com/cgartrel/LowRankDPP.jl.git"))

Data Files

The Amazon baby registry dataset is included in the data/ directory. This dataset is described in the Expectation-Maximization for Learning Determinantal Point Processes paper.

Basic Usage

DPPExamples.jl contains a number of examples that show how to convert CSV data files into the JLD files required by this low-rank DPP package, perform low-rank DPP learning using the SGA and SGHMC learning algorithms, compute predictions using models generated by both types of learning algorithms, and compute prediction metrics (mean percentile rank and precision@N).

To run the examples for the full CSV data conversion, learning, prediction, and prediction metrics pipeline for SGA-based models, use the following functions from DPPExamples.jl:

using LowRankDPP

convertCsvToBasketsExample()
dppLearningExample()
predictionExample()
predictionMetricsExample()
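For readers unfamiliar with the two metrics computed at the end of this pipeline, here is a minimal sketch under one common definition for the single-held-out-item setting: percentile rank is the share of candidate items scored no higher than the held-out item, and precision@N is 1 when the held-out item appears among the N highest-scoring candidates. The function names and toy scores are assumptions for illustration; the package's own implementation may differ in details such as tie handling.

```julia
# Percentile rank (0 to 100) of the held-out item among all candidates.
# `scores` maps each candidate item ID to its predicted score.
function percentile_rank(scores::Dict{Int,Float64}, heldout::Int)
    target = scores[heldout]
    # Share of candidates whose score does not exceed the held-out item's score.
    return 100 * count(s -> s <= target, values(scores)) / length(scores)
end

# Precision@N for one test instance with a single held-out item:
# 1.0 if the item is among the N best-scored candidates, otherwise 0.0.
function precision_at_n(scores::Dict{Int,Float64}, heldout::Int, n::Int)
    ranked = sort(collect(keys(scores)); by = k -> scores[k], rev = true)
    return heldout in ranked[1:min(n, length(ranked))] ? 1.0 : 0.0
end

# Toy example: four candidate items, item 3 was held out of the basket.
scores = Dict(1 => 0.10, 2 => 0.40, 3 => 0.25, 4 => 0.05)
println(percentile_rank(scores, 3))     # 75.0
println(precision_at_n(scores, 3, 1))   # 0.0 (item 2 ranks first)
println(precision_at_n(scores, 3, 2))   # 1.0
```

Averaging `percentile_rank` over all test instances gives the mean percentile rank; averaging `precision_at_n` gives precision@N.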

To run the examples for the learning and prediction pipeline for SGHMC-based models, use the following functions from DPPExamples.jl:

using LowRankDPP

dppLearningBayesianExample()
predictionForMCMCSamplesExample()

The provided hyperparameter settings should work for most of the included Amazon baby registry data, but they will likely need to be tuned for other datasets. For SGA learning, tune the epsFixed, epsInitialDecay, and numIterationsFixedEps settings in doStochasticGradientAscent (from DPPLearning.jl) so that learning converges to a local maximum. For SGHMC learning, tune the stepSizeLarger, stepSizeIntermediate, stepSizeSmaller, numIterationsLargerStepSize, and numIterationsIntermediateStepSize settings in runStochasticGradientHamiltonianMonteCarloSampler (from DPPLearningBayesian.jl) so that the sampler converges to a local mode.
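To give a feel for why step-size settings of this kind affect convergence, the sketch below runs plain (non-stochastic) gradient ascent on a toy one-dimensional objective with a generic hold-then-decay step size. It is not taken from DPPLearning.jl: the schedule, the function names, and the parameters `eps0`, `hold_iters`, and `horizon` are all hypothetical, and how the package's own settings map onto its actual schedule should be checked in the source.

```julia
# Hypothetical hold-then-decay schedule (an assumption, not the package's):
# constant at eps0 for the first `hold_iters` iterations, then
# eps0 / (1 + (t - hold_iters) / horizon).
function hold_then_decay(t::Integer; eps0::Float64 = 0.1,
                         hold_iters::Int = 50, horizon::Float64 = 100.0)
    t <= hold_iters && return eps0
    return eps0 / (1 + (t - hold_iters) / horizon)
end

# Toy gradient ascent on f(x) = -(x - 3)^2, whose maximum is at x = 3.
function toy_ascent(; iters::Int = 500)
    x = 0.0
    for t in 1:iters
        grad = -2 * (x - 3)               # derivative of f at x
        x += hold_then_decay(t) * grad    # ascent step with the scheduled size
    end
    return x
end

println(toy_ascent())   # very close to 3.0
```

In this toy problem, a fixed step above 1.0 makes the iterates diverge, while a very small one leaves them far from the maximum after the same number of iterations, which is the kind of trade-off that tuning on a new dataset has to resolve.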