valuepairs #15

zsunberg · 2019-09-30T20:07:53Z

Currently, we have actionvalues to get a vector of values, one for each action in the POMDP. Making sure these values always match up with the right actions seems rather error-prone; it might be more helpful to have a valuepairs function that returns a list of action-value pairs, so no one can get confused about which action each value corresponds to.
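To make the proposal concrete, here is a minimal sketch of the idea in Python. It is an illustration only, not the package's API: the helper name `value_pairs`, the action labels, and the numbers are all hypothetical, and the sketch assumes the value vector is already in the same order as the action list.

```python
from typing import Hashable, List, Sequence, Tuple


def value_pairs(
    actions: Sequence[Hashable], values: Sequence[float]
) -> List[Tuple[Hashable, float]]:
    """Pair each action with its value (hypothetical helper).

    Assumes values[i] is the value of actions[i]; a length mismatch is
    treated as an error rather than silently truncated by zip.
    """
    if len(actions) != len(values):
        raise ValueError("actions and values must have the same length")
    return list(zip(actions, values))


# Made-up example data: three actions and a value vector in the same order.
ordered = ["left", "right", "stay"]
values = [0.1, 0.7, 0.2]

pairs = value_pairs(ordered, values)

# The caller never has to index into the raw vector to find the best action.
best_action, best_value = max(pairs, key=lambda pair: pair[1])
```

With pairs, the consumer reads the action straight off each element instead of relying on an implicit index convention.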

rejuvyesh · 2019-09-30T20:11:02Z

The vector representation is likely more efficient for DeepRL algorithms.

MaximeBouton · 2019-09-30T20:53:07Z

So far we managed the order internally by either having an actionmap field in the policy object or by making sure we use ordered_actions. I agree that it could be confusing, especially if actionindex does not agree with the ordering from whatever actions returns.
valuepairs would remove any ambiguity, though I am not sure about the name.
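The mismatch described above can be sketched as a simple consistency check. This is a hypothetical Python illustration, not the package's code: `ordering_is_consistent` and the example mapping are made up, and indices are 0-based here whereas Julia's are 1-based.

```python
from typing import Callable, Hashable, Sequence


def ordering_is_consistent(
    actions: Sequence[Hashable], action_index: Callable[[Hashable], int]
) -> bool:
    """Return True if action_index agrees with the enumeration order.

    Hypothetical check: action_index(a) must equal the position of a
    in the actions sequence for every action.
    """
    return all(action_index(a) == i for i, a in enumerate(actions))


# Made-up example: the enumeration order and the index map disagree.
actions = ["left", "right", "stay"]
index_map = {"left": 0, "stay": 1, "right": 2}

consistent = ordering_is_consistent(actions, index_map.__getitem__)

# Reading a value vector laid out by index_map using the enumeration
# order would attribute "stay"'s value to "right".
values_by_index = [0.1, 0.2, 0.7]  # laid out according to index_map
misread = dict(zip(actions, values_by_index))
```

Here `consistent` is false, and `misread` shows how a plain vector silently attaches values to the wrong actions when the two orderings differ.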

zsunberg · 2019-10-01T02:28:01Z

> The vector representation is likely more efficient for DeepRL algorithms.

Yes, though only very slightly if things are type-stable. I think we should have both.

zsunberg mentioned this issue Sep 30, 2019

fix bug in actionvalues alphavectorpolicy #14

Merged
