Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vine: changes tasks_running to tasks_on_workers for display tables #3499

Merged

Conversation

btovar
Copy link
Member

@btovar btovar commented Sep 19, 2023

Proposed changes

tasks_on_workers is accurate for what the manager knows. tasks_running depends on worker messages that may be stale.

Post-change actions

  • make test Run local tests prior to pushing.
  • make format Format source code to comply with lint policies. Note that some lint errors can only be resolved manually (e.g., Python)
  • make lint Run lint on source code prior to pushing.
  • Type Labels Select github labels for the type of this change: bug, enhancement, etc.
  • Product Labels Select github labels for the product affected: TaskVine, Makeflow, etc.
  • PR RTM Mark your PR as ready to merge.

@dthain
Copy link
Member

dthain commented Sep 20, 2023

Hmm, I'm not sure if this has the intended effect. Here is what I see:

1 - The manager knows how many tasks are in the RUNNING state (sent to the worker) by just looking at itable_size(q->running_table). That is generally what we mean by "running" throughout the code.
https://github.com/cooperative-computing-lab/cctools/blob/master/taskvine/src/manager/vine_manager.c#L5177
2 - tasks_on_workers is the sum of tasks in RUNNING and WAITING_RETRIEVAL states.
3 - s->tasks_running is the sum of w->tasks_running which is the (stale) information received from workers about processes actually executing at the worker via asynchronous info tasks_running messages.

Currently # 3 is the thing displayed to users in vine_status. I think you are saying that the user really wants to see # 1. So the more direct solution would be to change vine_get_stats to report itable_size(q->running_table) as the value tasks_running, instead of running around and adding up all the worker values.

What do you think?

@btovar
Copy link
Member Author

btovar commented Sep 20, 2023

Ah, yes, directly looking at q->running_table would be better.

@btovar
Copy link
Member Author

btovar commented Sep 20, 2023

Actually, the pr should be ready. Note that w->current_tasks is only used when asking for the information per worker. For the rest, tasks_on_workers already is set from q->running_table in get_stats.

@dthain
Copy link
Member

dthain commented Sep 21, 2023

So instead of changing the status tools to display tasks_on_workers, please change the manager to simply report tasks_running = itable_size(q->running_table) instead of computing tasks_running = SUM(w->tasks_running)

@btovar
Copy link
Member Author

btovar commented Sep 21, 2023

RTM

@dthain
Copy link
Member

dthain commented Sep 25, 2023

The changes to vine_manager are fine, but I would like to keep the display fields showing tasks_running as RUNNING.

@btovar
Copy link
Member Author

btovar commented Sep 25, 2023

This pr would make it so that it shows tasks_on_workers. The tasks_running stat has always been a little bit hacky. Do you mean to keep the header RUNNING but show the tasks_on_workers value?

@dthain
Copy link
Member

dthain commented Sep 25, 2023

No, I just want the plain meaning used throughout: The reported tasks_running property should reflect the number of tasks in the RUNNING state (which is itable_size(q->running_table)) and show it to the user as RUNNING.

I understand that the meaning of "running" is a little fuzzy because it means "the manager sent it to the worker but the worker may not have received or acted on it yet". but that's true of any property in a distributed system.

@btovar
Copy link
Member Author

btovar commented Sep 25, 2023

Ok, should I remove the tasks_on_workers stat? It is too soon for deprecated stats!

@dthain
Copy link
Member

dthain commented Sep 25, 2023

No reason to remove stats reported to the catalog, they may still come in handy, I think.

@btovar
Copy link
Member Author

btovar commented Sep 25, 2023

They will have the same value, as they are both set to the size of q->tasks_running. Is that ok?

@btovar
Copy link
Member Author

btovar commented Sep 26, 2023

Sorry, they are different, my mistake. on_workers also has tasks with results.

@btovar btovar force-pushed the running_to_dispatched branch from bfa15fd to 112ebda Compare September 26, 2023 13:05
@dthain dthain merged commit cdd495a into cooperative-computing-lab:master Sep 26, 2023
6 checks passed
Jbrocket pushed a commit to Jbrocket/cctools that referenced this pull request Oct 2, 2023
@btovar btovar deleted the running_to_dispatched branch December 4, 2023 18:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants