
[NOT FOR MERGE] Adjust prompt to use view command #5506

Draft · wants to merge 9 commits into main

Conversation

@enyst (Collaborator) commented Dec 10, 2024


This is the prompt adjustment I used, with the goal of getting the LLM to prefer the `view` command for directories, which is part of its `file_editor` tool, over the alternatives (`ls -R /workspace` or `ls -la /workspace`).

Directory listing like this is used a lot during the initial exploration of the repository.

I think it would be interesting to evaluate this after Ryan's fix is merged into main.

Edited to add: in the event stream of the CodeAct 2.2 SWE-bench run, the agent uses `ls -R /workspace` a lot, and `ls -la` sometimes.
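
For reference, this is the call shape the adjusted prompt steers the agent toward, contrasted with the shell commands it currently favors. A minimal sketch: the `file_editor` call is copied from the transcript below and only runs inside the agent's IPython environment, where that tool is defined.

```python
# Encouraged: the `view` command of the file_editor tool, which lists a
# directory only up to 2 levels deep (call shape from the transcript below).
print(file_editor(**{'command': 'view', 'path': '/workspace'}))

# Discouraged alternatives the agent tends to use for the same purpose:
#   ls -R /workspace    # recursive listing of the entire tree
#   ls -la /workspace   # detailed listing of the top level only
```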


To run this PR locally, use the following command:

docker run -it --rm \
  -p 3000:3000 \
  -v /var/run/docker.sock:/var/run/docker.sock \
  --add-host host.docker.internal:host-gateway \
  -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:44a6bd9-nikolaik \
  --name openhands-app-44a6bd9 \
  docker.all-hands.dev/all-hands-ai/openhands:44a6bd9

@enyst marked this pull request as draft December 10, 2024 07:12
@enyst (Collaborator, Author) commented Dec 10, 2024

There we go:

THOUGHT: I'll help you implement the necessary changes to fix the issue with `django.utils.http.parse_http_date`. Let's follow the steps you outlined.

1. First, let's explore the repository structure:
CODE:
print(file_editor(**{'command': 'view', 'path': '/workspace/django__django__3.1'}))

OBSERVATION
[Agent Controller default] **IPythonRunCellObservation**
Here's the files and directories up to 2 levels deep in /workspace/django__django__3.1, excluding hidden items:
/workspace/django__django__3.1
/workspace/django__django__3.1/AUTHORS
/workspace/django__django__3.1/CONTRIBUTING.rst
/workspace/django__django__3.1/FUNDING.yml
/workspace/django__django__3.1/Gruntfile.js
/workspace/django__django__3.1/INSTALL
/workspace/django__django__3.1/LICENSE
/workspace/django__django__3.1/LICENSE.python
...

@enyst (Collaborator, Author) commented Dec 10, 2024

I ran 13 instances that are unresolved (0/13) in the CodeAct 2.2 results. They're all Django instances, and all in the intersection of SWE-bench Lite and Verified.

CodeAct 2.2: 0/13
Branch: 1/13

Too little to matter, but FWIW! @xingyaoww

@ryanhoangt (Contributor) commented Dec 11, 2024

I'm thinking about whether we should still make this change in the prompt, as encouraging the agent to use `view` over `ls -R` can save tokens, allowing the agent to execute more steps before reaching the context limit 🤔
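
For a rough sense of the savings, here is a back-of-envelope sketch. It assumes a checkout at the path shown in the transcript above, uses the 2-level cutoff that `view` reports, and treats entry counts as a crude proxy for tokens:

```python
import os

def count_entries(root, max_depth=None):
    """Count files and directories under root, optionally only max_depth levels deep."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        if max_depth is not None and depth >= max_depth:
            dirnames[:] = []  # prune: don't descend below the cutoff
            continue          # and don't count entries past it either
        total += len(dirnames) + len(filenames)
    return total

repo = "/workspace/django__django__3.1"      # path from the transcript above
full = count_entries(repo)                   # roughly what `ls -R` surfaces
shallow = count_entries(repo, max_depth=2)   # the 2-level listing `view` shows
print(f"ls -R entries: {full}; view (2 levels deep): {shallow}")
```

The gap between the two numbers is the token saving the prompt change is trying to capture.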

@enyst added the run-eval-m Runs evaluation with 30 instances label Dec 13, 2024
@enyst added run-eval-m Runs evaluation with 30 instances and removed run-eval-m Runs evaluation with 30 instances labels Dec 13, 2024
Running evaluation on the PR. Once eval is done, the results will be posted.

@openhands-agent (Contributor) commented Dec 13, 2024

Evaluation results:

## Summary

  • submitted instances: 30
  • empty patch instances: 12
  • resolved instances: 8
  • unresolved instances: 22
  • error instances: 0

The empty patches came from a litellm proxy error:

2024-12-13 11:47:01,561 - ERROR - [Agent Controller default] Error while running the agent: litellm.NotFoundError: NotFoundError: OpenAIException - Error code: 404 - {'error': {'message': 'litellm.NotFoundError: AnthropicException - {"type":"error","error":{"type":"not_found_error","message":"model: *"}}\nReceived Model Group=claude-3-5-sonnet-20241022....... 'code': '404'}}

@mamoodi (Collaborator) commented Dec 13, 2024

Haven't automated this part yet so here ya go:
evaluation.zip

@enyst (Collaborator, Author) commented Jan 5, 2025

@openhands-agent Your last attempt to fix the conflicts didn't work. Please do this again: pull main into this branch and fix the conflicts.

@openhands-agent (Contributor)
OpenHands started fixing the PR! You can monitor the progress here.

@All-Hands-AI deleted a comment from openhands-agent Jan 5, 2025
@All-Hands-AI deleted a comment from openhands-agent Jan 5, 2025
@enyst added the lint-fix label and removed the run-eval-m Runs evaluation with 30 instances label Jan 5, 2025