
Why does the LTV prediction multiply the predicted probability by the conditional expectation to get the final LTV? #11

RuiSUN1124 opened this issue Aug 2, 2023 · 2 comments

Comments


RuiSUN1124 commented Aug 2, 2023

Ref:

@Strategy24

I also found it strange.

As far as I understand, the regression part of the model is trained only on the subset of customers with an observed nonzero LTV:

positive = tf.cast(labels > 0, tf.float32)

safe_labels = positive * labels + (
    1 - positive) * tf.keras.backend.ones_like(labels)

regression_loss = -tf.keras.backend.mean(
    positive * tfd.LogNormal(loc=loc, scale=scale).log_prob(safe_labels),
    axis=-1)
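
For context, here is a minimal sketch (not the repo's exact code) of how a regression term like this is typically combined with a classification term into a single zero-inflated lognormal (ZILN) loss. The three-channel logits layout and the softplus on the scale are assumptions for illustration:

import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

def ziln_loss(labels, logits):
    """Zero-inflated lognormal loss: classification + conditional regression."""
    labels = tf.cast(labels, tf.float32)
    positive = tf.cast(labels > 0, tf.float32)

    # Assumed layout: logits[..., 0] -> logit of P(y > 0),
    # logits[..., 1] -> loc, logits[..., 2] -> raw scale.
    p_logit = logits[..., :1]
    loc = logits[..., 1:2]
    scale = tf.math.softplus(logits[..., 2:3])  # keep scale positive

    # Classification term: did the customer have any nonzero LTV at all?
    classification_loss = tf.keras.losses.binary_crossentropy(
        positive, p_logit, from_logits=True)

    # Regression term: lognormal negative log-likelihood, masked so that
    # only the positive examples contribute.
    safe_labels = positive * labels + (1 - positive) * tf.ones_like(labels)
    regression_loss = -tf.reduce_mean(
        positive * tfd.LogNormal(loc=loc, scale=scale).log_prob(safe_labels),
        axis=-1)

    return classification_loss + regression_loss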

If loc and scale give the most accurate predictions on this subset of customers, then

preds = (positive_probs *
      tf.keras.backend.exp(loc + 0.5 * tf.keras.backend.square(scale)))

gives a shifted (biased) estimate in the general case, since positive_probs is not 0 or 1 but somewhere in between.
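
For reference, the second factor here is exactly the mean of a lognormal distribution: if y | y > 0 ~ LogNormal(loc, scale), then

E(y | y > 0) = exp(loc + scale^2 / 2)

so preds is positive_probs * E(y | y > 0).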

I think the probability estimated by the classification part of the model should somehow be taken into consideration by the regression part.


Ty4Code commented Apr 2, 2024

It actually makes perfect sense if you think about the intention of a zero-inflated lognormal method.

Imagine a simple case where a customer has an LTV of $0 with 99% probability, or an LTV of exactly $100 otherwise.

When we use a zero-inflated method for LTV, we estimate the probability mass at zero LTV (the classification part) and the conditional expected LTV for the non-zero-LTV customers (the regression part).

So in the case above, a perfect model would estimate that the customer has a 1% chance of a non-zero LTV, and that, conditional on being non-zero, their expected LTV is $100.

But if we took the regression output alone, we would say the expected LTV of our customers is $100, which is clearly not true. We have to multiply the probability of the customer being non-zero by their expected LTV conditional on being non-zero.
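
Plugging the numbers from the example above into that product:

E(LTV) = 0.01 * $100 = $1

so the unconditional expected LTV per customer is $1, not $100.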

If we assume that y is non-negative, then we can see that:

E(y) = P(y > 0) * E(y | y > 0) + P(y = 0) * E(y | y = 0)
E(y) = P(y > 0) * E(y | y > 0) + P(y = 0) * (0)
E(y) = P(y > 0) * E(y | y > 0)

Our model is essentially estimating P(y > 0) with the classification output and it is estimating E(y | y > 0) with the regression output.

So that is why we multiply the probability of non-zero LTV by the conditional expected LTV: it gives the true customer expected LTV that we care about, which is E(y).
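
A quick numerical check (illustrative only, plain NumPy rather than anything from the repo) that the product recovers E(y) for zero-inflated data:

import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
p = 0.01  # true P(y > 0)

# Zero-inflated lognormal samples: zero with probability 1 - p,
# lognormal otherwise.
is_positive = rng.random(n) < p
y = np.where(is_positive, rng.lognormal(mean=3.0, sigma=1.0, size=n), 0.0)

p_hat = is_positive.mean()         # estimate of P(y > 0)
cond_mean = y[is_positive].mean()  # estimate of E(y | y > 0)

print(p_hat * cond_mean)  # ~ E(y)
print(y.mean())           # direct estimate of E(y); the two agree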
