Inconsistency in action values #44

Open
markroxor opened this issue Sep 24, 2018 · 4 comments

Comments

@markroxor
Collaborator

world.step is expected to receive an array of binary values. That array comes from brain.sense_act_learn, which in turn returns the output of self.postprocessor.convert_to_actions, but its documentation says that it returns "A set of actions for the world, each between 0 and 1." Returning an array of float actions is inconsistent with what OpenAI's Gym expects.
Did I miss anything, @brohrer?

@markroxor
Collaborator Author

Blocking #39

@brohrer
Owner

brohrer commented Sep 25, 2018

At the moment, self.postprocessor.convert_to_actions() returns only 0 or 1 values, but in the future it may return float actions valued between 0 and 1. Worlds should be able to handle these, even if only by rounding them first. The documentation in becca may not all be consistent with this yet.
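
For illustration, the "round them first" option could look like the minimal sketch below. The helper name round_actions and the 0.5 threshold are assumptions made for this example and are not part of becca.

```python
def round_actions(actions, threshold=0.5):
    """Map float actions in [0, 1] to binary 0/1 values.

    Hypothetical helper, not part of becca. The threshold of 0.5
    is an assumption; a world could pick a different cut-off.
    """
    return [1 if a >= threshold else 0 for a in actions]


# Already-binary actions pass through unchanged.
binary_actions = round_actions([0.0, 0.49, 0.5, 1.0])
```

An explicit threshold is used instead of Python's built-in round(), which rounds halves to the nearest even integer, so round(0.5) gives 0.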

I'm not familiar with what Gym worlds expect. Is it very consistent across worlds? I expect there will need to be some connecting code to get them to talk smoothly. Translating actions into the expected format will probably be part of that.

@markroxor
Collaborator Author

Could you please run this gist: https://gist.github.com/markroxor/c50a6bfc69da001180374a9e977ac21a (install gym first with pip install gym)? The actions parameter that is fed to World.step is a float.

A Gym environment expects the index of an action at each step, and it can only perform one action at a time. Yes, we would need some connecting code.

I think we need the API docs to proceed with the integration, since the code docstrings cannot be relied upon.
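
One possible shape for that connecting code, as a rough sketch: pick the index of the strongest entry in the action array and pass that single index to the environment. The function name actions_to_index is hypothetical, and choosing the largest value (with ties going to the lowest index) is an assumption, not something becca or Gym prescribes.

```python
def actions_to_index(actions):
    """Collapse an array of action values into one action index.

    Hypothetical helper. Picks the position of the largest value;
    ties (including an all-zero array) resolve to the lowest index.
    """
    if len(actions) == 0:
        raise ValueError("actions must not be empty")
    return max(range(len(actions)), key=lambda i: actions[i])


# The resulting index is what would be handed to the Gym environment's step().
index = actions_to_index([0, 1, 0])
```

Note that an all-zero action array maps to index 0 under this scheme, which may not be a no-op in every environment, so a real adapter would need to decide how to handle that case.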

@brohrer
Owner

brohrer commented Sep 26, 2018

Nice work putting this connecting code together. I agree: I ran the gist, saw the same result, and reached the same conclusions. I'm picturing some lines in init that, given a Gym world name, use introspection to figure out the nature of the actions and the observations (Box and Discrete) and convert them to and from sensors and actions for becca.

An n-valued Discrete Gym space would correspond to n sensors or actions. So would an n-dimensional Box.
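
The correspondence above could be sketched roughly as follows. This assumes only that a Discrete space exposes an n attribute and a Box space exposes a shape attribute, as gym.spaces.Discrete and gym.spaces.Box do. The function names and the FakeDiscrete and FakeBox stand-ins are hypothetical, included so the sketch runs without gym installed.

```python
from collections import namedtuple
from math import prod

# Stand-ins for gym.spaces.Discrete and gym.spaces.Box (assumptions
# for this sketch); real Gym spaces expose the same attributes.
FakeDiscrete = namedtuple("FakeDiscrete", ["n"])
FakeBox = namedtuple("FakeBox", ["shape"])


def n_channels(space):
    """Number of becca sensors or actions a space would map to."""
    if hasattr(space, "n"):          # Discrete-like: n possible values
        return int(space.n)
    if hasattr(space, "shape"):      # Box-like: one channel per element
        return prod(space.shape)
    raise TypeError("only Discrete-like and Box-like spaces are handled")


def discrete_to_sensors(observation, n):
    """One-hot encode a Discrete observation into n sensor values."""
    sensors = [0.0] * n
    sensors[observation] = 1.0
    return sensors


n_actions = n_channels(FakeDiscrete(n=3))
n_sensors = n_channels(FakeBox(shape=(4,)))
```

Box observations would additionally need scaling into the 0 to 1 range, which is left out here because Box bounds can be unbounded and the right scaling depends on the world.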
