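Something like AgentBoard (https://github.com/hkust-nlp/AgentBoard ), whose environments can be run directly without containers. This would make VisualAgentBench easier to use for non-root users. My guess is that AgentBench does not release a non-Docker version of its datasets because the OS tasks really can affect the host environment, but the tasks here don't seem to have that problem?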
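@Fu-Dayuan We will do our best to also provide a way to build the environments from scratch yourself. As things stand, OmniGibson, Minecraft, and Mobile can in theory run without Docker. WebArena inherently requires Docker (it hosts websites). For CSS, we have observed that screenshot sizes and page layouts differ across systems, so Docker may be necessary to standardize the system configuration and ensure reproducibility.
Another benefit of Docker is that it allows concurrent evaluation. For example, if OmniGibson is evaluated without Docker-based concurrency, a single model takes roughly 48 hours to evaluate; Mobile may take about 12 hours. That is likely to be unacceptable in practice when debugging.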
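To make the concurrency argument concrete, here is a rough back-of-the-envelope sketch. It takes the sequential estimates quoted above (about 48 hours for OmniGibson, about 12 hours for Mobile) and divides them by a number of parallel workers. The worker counts and the function name `estimated_wall_clock_hours` are hypothetical, and the ideal linear speedup is an assumption: real runs will have container startup and scheduling overhead, and the thread gives no measured figures for concurrent runs.

```python
def estimated_wall_clock_hours(sequential_hours: float, workers: int) -> float:
    """Idealized estimate: tasks split evenly across workers, no overhead.

    This is an assumption for illustration, not a measured result.
    """
    if workers < 1:
        raise ValueError("workers must be >= 1")
    return sequential_hours / workers


# Sequential estimates quoted in the reply above (approximate).
SEQUENTIAL_HOURS = {"OmniGibson": 48.0, "Mobile": 12.0}

if __name__ == "__main__":
    # Hypothetical worker counts, chosen only to show the trend.
    for env_name, hours in SEQUENTIAL_HOURS.items():
        for workers in (1, 4, 8):
            estimate = estimated_wall_clock_hours(hours, workers)
            print(f"{env_name}: {workers} worker(s) -> ~{estimate:.1f} h")
```

Even under this optimistic model, the point stands: running in a single non-containerized environment leaves you at the one-worker figure.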