There are 2 ways to access the tasks:

- Use the `file://` protocol (Recommended): open `miniwob-sandbox/html/miniwob/` in the browser.
  - The URL should now be something like `file:///path/to/web-agents/miniwob-sandbox/html/miniwob/`
  - This should show the directory listing of all task HTML files.
- Run a simple server: go to `miniwob-sandbox/html/` and run `./http-server`.
  - The tasks should now be accessible at `http://localhost:8080/miniwob/`
  - The port can be specified like this: `./http-server -p 8765`
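The two access methods differ only in the URL prefix. The following is a minimal sketch of how a script might build a task URL for either method; the helper name `task_url` is hypothetical, and the base path is the placeholder path used above, not a real location.

```python
from pathlib import PurePosixPath
from typing import Optional


def task_url(task: str, base_dir: Optional[str] = None, port: int = 8080) -> str:
    """Build the URL of a task page (hypothetical helper).

    If base_dir (the absolute path of miniwob-sandbox/html) is given, the
    file:// protocol is used. Otherwise the URL assumes that ./http-server
    is running from miniwob-sandbox/html/ on the given port.
    """
    page = f"{task}.html"
    if base_dir is not None:
        path = PurePosixPath(base_dir) / "miniwob" / page
        return f"file://{path}"
    return f"http://localhost:{port}/miniwob/{page}"


if __name__ == "__main__":
    print(task_url("click-test", "/path/to/web-agents/miniwob-sandbox/html"))
    print(task_url("click-test", port=8765))
```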
- Start the recording server:

      # Create an output directory
      mkdir out/
      ./record.py out/

- Append `?record=true` to the URL of the task you want to record. For example, for the `click-test` task, go to `file:///path/to/web-agents/miniwob-sandbox/html/miniwob/click-test.html?record=true`
- To view the results, open `viewer/viewer.html` while the recording server is running. The URL should be like `file:///path/to/web-agents/miniwob-sandbox/html/viewer/viewer.html`
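When many tasks need to be recorded, the `?record=true` parameter can be appended programmatically. This is a minimal sketch using only the Python standard library; the function name `with_recording` is hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_recording(url: str) -> str:
    """Return url with the record=true query parameter set (hypothetical helper)."""
    parts = urlsplit(url)
    # Keep any existing query parameters and add or overwrite "record".
    query = dict(parse_qsl(parts.query))
    query["record"] = "true"
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    print(with_recording(
        "file:///path/to/web-agents/miniwob-sandbox/html/miniwob/click-test.html"
    ))
```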
This version of MiniWoB incorporates a few additional JavaScript utilities.

Set the global random seed of the environment. The optional argument `seed` can be an object.

`getDOMInfo()` returns a nested object containing information about the current DOM states. The returned object corresponds to the `<body>` element. Its children can be accessed under the `children` field.

In Python, the `step` method in `MiniWoBInstance` calls this function to build the `MiniWoBState`.
Each visible DOM element is converted into an object with the following fields:

- `tag` (string): Tag name
  - For normal elements, this is the uppercased tag name (e.g., `"DIV"`)
  - For `<input>` elements, the input type is appended (e.g., `"INPUT_text"`)
  - Each non-empty text node is converted into pseudo-elements with tag `"t"`, where each pseudo-element represents one line of text. However, if the text node is the only child of its parent, the text pseudo-element is not created, and its text is assigned to the parent element instead.
- `ref` (number): Reference number
  - Within each episode, the `ref` number of the same object stays the same
  - For the same random seed, the `ref` number of the same object should be the same
  - `ref` for normal elements starts from 1, while `ref` for text pseudo-elements counts down from -1
- `children` (list): Recursive list of objects corresponding to the children
- `left`, `top`, `width`, `height` (number): Geometry of the element
- `id` (string): Element's `id`
- `classes` (string): Element's `class`es (space-separated)
- `bgColor`, `fgColor` (string): Background and foreground colors
- `focused` (boolean): Indicates if the element is being focused on
- `tampered` (boolean): Indicates if the element is tampered (clicked, focused, typed, etc.)
- `value`: For `<input>`, this contains the input value
  - For `checkbox` and `radio` types, this contains a boolean indicating whether the input is selected
  - For other input types, this contains a text value
- `text` (string): For child nodes and text pseudo-elements, this contains the text content
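To make the nested structure concrete, the sketch below walks a DOM info object on the Python side and looks up an element by its `ref`. The `sample` tree is a hand-written, hypothetical example that only uses the fields listed above (with geometry and color fields omitted for brevity); it is not output captured from a real task, and the helper names `iter_elements` and `find_by_ref` are likewise illustrative.

```python
from typing import Any, Dict, Iterator, Optional

DomInfo = Dict[str, Any]


def iter_elements(node: DomInfo) -> Iterator[DomInfo]:
    """Yield node and all of its descendants in depth-first order."""
    yield node
    for child in node.get("children", []):
        yield from iter_elements(child)


def find_by_ref(root: DomInfo, ref: int) -> Optional[DomInfo]:
    """Return the element with the given ref, or None if it is absent."""
    for node in iter_elements(root):
        if node.get("ref") == ref:
            return node
    return None


# Hypothetical DOM info. The DIV has two children, so its text node becomes
# a "t" pseudo-element with a negative ref. The BUTTON's text node is its
# only child, so the text is stored on the BUTTON itself.
sample: DomInfo = {
    "tag": "BODY", "ref": 1, "children": [
        {"tag": "DIV", "ref": 2, "children": [
            {"tag": "t", "ref": -1, "text": "Click the button.", "children": []},
            {"tag": "BUTTON", "ref": 3, "text": "OK", "children": []},
        ]},
        {"tag": "INPUT_text", "ref": 4, "value": "", "children": []},
    ],
}

if __name__ == "__main__":
    for element in iter_elements(sample):
        print(element["ref"], element["tag"], element.get("text", ""))
```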
Can be called on the result of `getDOMInfo()` to get a flattened representation. Useful for debugging in Chrome console.

Click on an element regardless of its location and visibility. The argument `ref` is the ref value generated by the previous call to `getDOMInfo()`.

Visualize the attention weights on the screen. The argument `values` is a 2D array of shape 20 × 20.
Each demonstration is saved as a JSON file. The root object generated by `core/record.js` contains the following fields:

- `taskName` (string)
- `utterance` (string)
- `reward` (number): Reward as defined by the task
- `rawReward` (number): 1 if succeeded and -1 if failed
- `states`: a list of state objects
  - One state is recorded for the initial state
  - Two states are recorded for each event, one before the event resolves and one after the event resolves

Each state object has the following fields:

- `time` (number): Time elapsed since the episode started
- `action`: An action-specific object (not present for the initial state) with the following common keys:
  - `type` (string)
  - `timing` (number): the `eventPhase` property of the JS event object. This is 1 before the event resolves (capturing phase) and 3 after the event resolves (bubbling phase).
- `dom`: The DOM info as generated by `getDOMInfo()`
  - The event target will have a special key `recordingTarget` set to `true`.
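The format above can be read with a few lines of Python. The following is a minimal sketch that lists each recorded event together with its phase and the tag of the event target. The `SAMPLE` demonstration is a hypothetical, hand-written example containing only the fields described above; the values (including the utterance, reward, and times) are made up for illustration, and the helper names are not part of the repository.

```python
import json
from typing import Any, Dict, List, Optional

# Hypothetical demonstration with one click event (two recorded states).
SAMPLE = """
{
  "taskName": "click-test",
  "utterance": "Click the button.",
  "reward": 0.8,
  "rawReward": 1,
  "states": [
    {"time": 0,
     "dom": {"tag": "BODY", "ref": 1, "children": []}},
    {"time": 1200, "action": {"type": "click", "timing": 1},
     "dom": {"tag": "BODY", "ref": 1, "children": [
       {"tag": "BUTTON", "ref": 2, "recordingTarget": true, "children": []}]}},
    {"time": 1200, "action": {"type": "click", "timing": 3},
     "dom": {"tag": "BODY", "ref": 1, "children": [
       {"tag": "BUTTON", "ref": 2, "recordingTarget": true, "children": []}]}}
  ]
}
"""


def find_target(dom: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the element marked with recordingTarget, if any."""
    if dom.get("recordingTarget"):
        return dom
    for child in dom.get("children", []):
        hit = find_target(child)
        if hit is not None:
            return hit
    return None


def summarize(demo: Dict[str, Any]) -> List[str]:
    """Return one line per recorded state, skipping the initial state."""
    lines = []
    for state in demo["states"]:
        action = state.get("action")
        if action is None:  # the initial state has no action
            continue
        phase = {1: "before", 3: "after"}.get(action["timing"], "other")
        target = find_target(state["dom"])
        tag = target["tag"] if target is not None else "?"
        lines.append(f'{state["time"]} {action["type"]} {phase} {tag}')
    return lines


if __name__ == "__main__":
    for line in summarize(json.loads(SAMPLE)):
        print(line)
```

For a real recording, `json.loads(SAMPLE)` would be replaced by loading one of the JSON files written to the output directory.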