failed processing entk session #108
Comments
The utilization plots were only tested with plain RP sessions. In the error above, the code seems to stumble over non-RP entities (such as RE tasks), whose UIDs cannot be parsed correctly. I think that RA's `bin/rp_inspect/plot_util.py` should, after creation of the session (around line 107), filter for all RP entities and thus filter out all other entities. |
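A minimal sketch of the filtering idea suggested above, written in plain Python so that it runs without RA installed. The entity dictionaries, the `keep_rp_entities` helper, and the set of RP entity type names are hypothetical and chosen for illustration only; they are not taken from `plot_util.py` or from the RA API.

```python
# Hypothetical set of entity types that belong to RP (an assumption).
RP_ETYPES = {'session', 'pmgr', 'umgr', 'pilot', 'unit'}


def keep_rp_entities(entities, rp_etypes=RP_ETYPES):
    """Return only the entities whose 'etype' is in the RP type set."""
    return [e for e in entities if e.get('etype') in rp_etypes]


# Made-up stand-in for the entities of a mixed EnTK/RP session.
entities = [
    {'uid': 'pilot.0000',    'etype': 'pilot'},
    {'uid': 'unit.000000',   'etype': 'unit'},
    {'uid': 'task.0000',     'etype': 'task'},      # non-RP entity, dropped
    {'uid': 'pipeline.0000', 'etype': 'pipeline'},  # non-RP entity, dropped
]

rp_only = keep_rp_entities(entities)
print([e['uid'] for e in rp_only])
```

Filtering by entity type rather than by parsing UIDs avoids the UID-parsing step that the non-RP entities appear to break.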
Thanks, Andre, for the comment. I interpret the filtering like this (with
I will try to replicate the same steps with RP only, for plotting purposes, but I have a question: do GPU resources affect analytics? I believe they wouldn't, but |
I see. Hyungro, can you provide me the session so that I can try to reproduce? |
re.session.login1.hrlee.018273.0000.tar.gz |
re.session.login1.hrlee.018273.0000.agent.tar.gz |
This problem should be gone when using radical-cybertools/radical.pilot/pull/2032. @lee212 implied on Slack that this is not the case: could you please send the stack you are using? This still stumbles over the |
My test runs show that entk is missing for
|
For the trace purpose, the but the values look like:
This looks correct for radical.pilot, but when I look at radical.entk:
|
Why don't I see any of the events in the entk profiling? For example,
whereas rp shows:
As long as Maybe this is related to |
I am not sure I understand well the issue, apologies. |
The analytics try to filter these states and events for plots, at least for RP. If EnTK does not provide them, the analytics need to use different logic or look for other states/events that EnTK does provide. I assume (from your comment "you should get EnTK states ...") that my session provides the possible EnTK states according to this: https://github.com/radical-cybertools/radical.entk/blob/devel/src/radical/entk/states.py. |
@lee212 : can you please send me a small reproducer so that I can see how you produced the event lists? |
|
That is not one of the sessions attached here I think? Do you mind adding it to the ticket? Thanks! |
states like
#!/usr/bin/env python3
import radical.analytics as ra

sid = 're.session.login1.hrlee.018273.0000'
session = ra.Session.create(sid, 'radical.entk')
pilots = session.get(etype='pilot')

# index 1 of each event tuple is the event name
for e in pilots[0].events:
    print('event: ', e[1])
# index 5 of each event tuple is the state (empty for non-state events)
for e in pilots[0].events:
    print('state: ', e[5])
with the session
$ ./t.py | sort | uniq -c
1 event: bootstrap_0_start
484 event: cmd
5 event: get
1 event: hostname
1 event: put
1 event: rp_install_start
1 event: rp_install_stop
1 event: staging_in_start
1 event: staging_in_stop
6 event: state
1 event: submission_start
1 event: submission_stop
2 event: sync_rel
2 event: ve_activate_start
1 event: ve_activate_stop
1 event: ve_setup_start
1 event: ve_setup_stop
494 state:
1 state: CANCELED
1 state: NEW
1 state: PMGR_ACTIVE
10 state: PMGR_ACTIVE_PENDING
1 state: PMGR_LAUNCHING
3 state: PMGR_LAUNCHING_PENDING
which seems like the expected set of states and events for a pilot. I guess I would need to look at your specific session to understand what's happening... |
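The same kind of tally can be produced directly in Python instead of piping through `sort | uniq -c`. The sketch below uses made-up event tuples that only mimic the index convention of the script above (index 1 for the event name, index 5 for the state); the other fields and all values are placeholders, not data from the attached session.

```python
from collections import Counter

EVENT, STATE = 1, 5  # tuple indices as used in the script above

# Made-up sample events; fields other than EVENT and STATE are placeholders.
events = [
    (0.0, 'state',             None, None, 'pilot.0000', 'NEW'),
    (1.0, 'state',             None, None, 'pilot.0000', 'PMGR_LAUNCHING'),
    (2.0, 'bootstrap_0_start', None, None, 'pilot.0000', ''),
    (3.0, 'cmd',               None, None, 'pilot.0000', ''),
    (4.0, 'cmd',               None, None, 'pilot.0000', ''),
]

# Count how often each event name and each state value occurs.
event_counts = Counter(e[EVENT] for e in events)
state_counts = Counter(e[STATE] for e in events)

for name, count in sorted(event_counts.items()):
    print('%4d event: %s' % (count, name))
for name, count in sorted(state_counts.items()):
    print('%4d state: %s' % (count, name))
```

An empty state string is counted as its own bucket, which would correspond to the bare `state:` line in the listing above.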
This is interesting: I ran the same script and the result looks like:
|
Any of the plots seems valid by |
Are you really sure that the session dir contains the agent profiles under 'pilot.0000/'? If so, can you please attach a tarball of the dir again? Thanks. |
No, I don't think I had it, so I guess |
Two updates:
I keep searching for where this difference is created. |
Can this be the one removing Because I find resource_details exist at |
@lee212 , @andre-merzky can this be closed now? Did we find a solution for |
|
The |
This might or might not be related to the py3 transition in entk, but plotting analytics from the session data produces errors.