Official Implementation of the retrofitted WebShop environment in "Towards Unified Alignment Between Agents, Humans, and Environment"

AgentForceTeamOfficial/UA2-WebShop

UA2-WebShop

This repo contains the implementation of UA2-WebShop, the retrofitted WebShop environment proposed in Towards Unified Alignment Between Agents, Humans, and Environment. Based on the original WebShop environment, UA2-WebShop implements a testbed for agents to align with human intentions, environmental dynamics, and self-constraints simultaneously.

An online demo of the environment can be accessed at http://49.232.144.86:5000/.

If you find this work useful in your research, please cite:

@article{yang2024towards,
   title = {Towards Unified Alignment Between Agents, Humans, and Environment},
   author = {Yang, Zonghan and Liu, An and Liu, Zijun and Liu, Kaiming and Xiong, Fangzhou and Wang, Yile and Yang, Zeyuan and Hu, Qingyuan and Chen, Xinrui and Zhang, Zhenhe and Luo, Fuwen and Guo, Zhicheng and Li, Peng and Liu, Yang},
   journal={arXiv preprint arXiv:2402.07744},
   year = {2024}
}

Overview

UA2-WebShop consists of 10 human user profiles, each followed by an instruction group of 50 consecutive shopping assistance tasks. Built on the WebShop environment, UA2-WebShop inherits the original 1.18 million real-world products. In the UA2-WebShop environment, an LLM-powered agent needs to analyze the user's initial profile and to track and infer a series of shopping instructions. UA2-WebShop also implements two reranking mechanisms: DPP-based reranking and collaborative-filtering-based reranking. As a result, the agent must also cope with changing ranks in the searched item lists.

UA2-WebShop is a reflection of two lines of alignment covered in the position paper: alignment with human intentions and alignment with environmental dynamics. The third line of alignment, alignment with self-constraints, is tracked and computed in the runtime environment, the implementation of which can be found in the sibling repo.

Setup

The setup of UA2-WebShop follows the setup of the original WebShop. After the installation of required packages and the preparation of original data, download the data for UA2-WebShop and put them in the data directory.

The data for UA2-WebShop can be downloaded here:

  • new_tasks_ins_v4.json contains the newly constructed tasks: 10 human user profiles in total, each followed by an instruction group of 50 consecutive tasks.
  • items_id2idx.json indicates the index of each shopping item.
  • The embeddings folder contains the embedding vectors for each item, pre-computed with a Llama-2 13B base model. The embedding data is used in the DPP-based reranking mechanism (described in this paper).
  • prime_user_score.json stores the preference weights of 'other users' (simulated by ChatGPT role playing) for collaborative filtering based reranking.
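
Before launching the site, it can help to confirm that the downloaded files are in place. The following is a minimal sketch, not part of the repo: the file and folder names come from the list above, while the helper name `missing_data_files` and the assumption that all four entries sit directly under `data` are hypothetical.

```python
from pathlib import Path

# Names taken from the data list above. That they all sit directly under
# the data directory is an assumption, not something the repo confirms.
EXPECTED_ENTRIES = [
    "new_tasks_ins_v4.json",
    "items_id2idx.json",
    "embeddings",
    "prime_user_score.json",
]


def missing_data_files(data_dir="data"):
    """Return the expected entries that do not exist under data_dir."""
    root = Path(data_dir)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_data_files()
    if missing:
        print("Missing UA2-WebShop data:", ", ".join(missing))
    else:
        print("All UA2-WebShop data entries found.")
```

If the actual layout differs (for example, if the files belong in a subfolder), adjust `data_dir` or the entry names accordingly.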

Usage

The UA2-WebShop environment is deployed in the html mode. Launch the UA2-WebShop webpage:

> ./run_prod.sh

The site should then be viewable in the browser at http://localhost:5000/. Because each user's clicking history is tracked individually, users must be told apart, so the landing page at http://localhost:5000/login asks for a username.

Enter a unique username to reach the task page at http://localhost:5000/user_0/task_0.

Once entered, the username is kept in the session to identify the individual user. It is dropped when the session ends or the browser cache is reset.

Flags (e.g., --log, --attrs) are inherited from the original WebShop. UA2-WebShop introduces the following new flag:

  • --rerank: Include this flag to enable reranking in UA2-WebShop, which applies a weighted average of the two reranking algorithms. The balancing weights are 0.8 for collaborative filtering and 0.2 for DPP. This is the setting used in our paper; it can be changed on line 439 of web_agent_site/engine/engine.py.
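
To illustrate what a 0.8 / 0.2 weighting means in practice, here is a minimal sketch of combining two per-item scores with a weighted average and sorting by the result. It is an illustration only and not the code in engine.py: the function name `combine_rerank_scores`, the dictionary inputs, and the assumption that both algorithms yield comparable per-item scores are all hypothetical.

```python
def combine_rerank_scores(cf_scores, dpp_scores, cf_weight=0.8):
    """Rank item ids by a weighted average of two score dictionaries.

    cf_scores and dpp_scores map item id -> score (hypothetical inputs).
    cf_weight is the collaborative filtering weight; DPP gets the rest.
    """
    dpp_weight = 1.0 - cf_weight
    combined = {
        item: cf_weight * cf_scores[item] + dpp_weight * dpp_scores[item]
        for item in cf_scores
    }
    # Highest combined score first.
    return sorted(combined, key=combined.get, reverse=True)


if __name__ == "__main__":
    cf = {"A": 0.9, "B": 0.2, "C": 0.5}
    dpp = {"A": 0.1, "B": 1.0, "C": 0.5}
    print(combine_rerank_scores(cf, dpp))
```

With the default weight, the collaborative filtering score dominates the ordering. Lowering `cf_weight` shifts the ranking toward the DPP score.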

Authors

The UA2-WebShop project is a collaborative effort of the AgentForce Team. The task design and the environment construction of UA2-WebShop were initiated and led by Zonghan Yang ([email protected]). The following members are listed in alphabetical order, with their main contributions:
  • Xinrui Chen implemented the novel pages in UA2-WebShop (esp. the user login mechanism), and was also in charge of data gathering and processing of the simulated preferences.
  • Yile Wang led and implemented the ChatGPT role-playing part to gather simulated preference data for collaborative filtering.
  • Fangzhou Xiong implemented both the DPP-based and the collaborative filtering based reranking mechanisms.
  • Zhenhe Zhang processed the simulated preference data, and was also in charge of the response behavior of the website in terms of time delays.
  • Kaiming Liu and Zeyuan Yang, together with all the members listed above, contributed to the validity checks of task construction in UA2-WebShop.
  • Zhicheng Guo, Qingyuan Hu, An Liu, Zijun Liu, and Kaiming Liu validated the environment and provided feedback from the perspective of implementing LLM-powered agent baselines.
  • Chi Chen, Fuwen Luo, Ziyue Wang, Siyu Wang, and Xiaolong Wang contributed to the human evaluation of ChatGPT-simulated preference data.

The project is advised by Peng Li ([email protected]) and Yang Liu ([email protected]), and is heavily inspired by the original WebShop environment.

Contributions

We welcome suggestions from anyone interested in our project, whatever your background! PRs, issues, and messages are all welcome. We'll be sure to follow up shortly!
