use the priority of kube-batch #209
base: master
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Hi @YesterdayxD. Thanks for your PR. I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).
📝 Please visit https://cla.developers.google.com/ to sign.
Once you've signed (or fixed any issues), please reply here with
What to do if you already signed the CLA
Individual signers
Corporate signers
ℹ️ Googlers: Go here for more info. |
pkg/apis/pytorch/v1/types.go
Outdated
@@ -69,6 +69,10 @@ type PyTorchJobSpec struct {
     // "Worker": PyTorchReplicaSpec,
     // }
     PyTorchReplicaSpecs map[PyTorchReplicaType]*common.ReplicaSpec `json:"pytorchReplicaSpecs"`
+
+    // (comment originally written in Chinese) Add an attribute for determining the priority
Please use English here.
pkg/apis/pytorch/v1/types.go
Outdated
+
+    // (comment originally written in Chinese) Add an attribute for determining the priority
+    //add PriorityClassName
+    PriorityClassName string `json:"priorityClassName,omitempty"`
Suggested change:
-    PriorityClassName string `json:"priorityClassName,omitempty"`
+    PriorityClassName *string `json:"priorityClassName,omitempty"`
Since it is optional, we can define it as a pointer.
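As a rough illustration of the suggestion above, the sketch below shows how a pointer field combined with `omitempty` lets the JSON encoding distinguish "not set" from "explicitly empty". The `jobSpec` type and `marshalSpec` helper are hypothetical stand-ins, not the operator's real types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// jobSpec is a hypothetical, trimmed-down stand-in for PyTorchJobSpec.
type jobSpec struct {
	// A nil pointer means "not set"; omitempty then drops the key entirely.
	// A non-nil pointer is always encoded, even if it points to "".
	PriorityClassName *string `json:"priorityClassName,omitempty"`
}

// marshalSpec returns the JSON encoding of spec as a string.
func marshalSpec(spec jobSpec) string {
	b, err := json.Marshal(spec)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	name := "high-priority"
	fmt.Println(marshalSpec(jobSpec{}))                         // field omitted
	fmt.Println(marshalSpec(jobSpec{PriorityClassName: &name})) // field present
}
```

With a plain `string` field, `omitempty` would drop both the unset case and an explicit empty string, so the two could not be told apart.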
-    _, err := pc.SyncPodGroup(job, minAvailableReplicas)
+    priorityClassName:=getPriorityClassName(job)
+    //_, err := pc.SyncPodGroup(job, minAvailableReplicas)
+    _, err := pc.SyncPodGroupTest(job, minAvailableReplicas,priorityClassName)
Why should we use SyncPodGroupTest instead of SyncPodGroup?
I added this code in SyncPodGroupTest:
    Spec: v1alpha1.PodGroupSpec{
        MinMember:         minAvailable.IntVal,
        PriorityClassName: priorityClassName,
    },
The name of this function is inappropriate; I only used it to test my idea.
@gaocegege Should I delete my SyncPodGroupTest function and move the code I wrote into the original SyncPodGroup function?
I think so.
I got it.
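A minimal sketch of the outcome agreed in this thread, assuming the priority class name is passed as an extra argument to a single spec-building function rather than living in a separate `SyncPodGroupTest` copy. The `podGroupSpec` type and `buildPodGroupSpec` helper are hypothetical; they only mirror the two `v1alpha1.PodGroupSpec` fields quoted above and are not the kube-batch API.

```go
package main

import "fmt"

// podGroupSpec is a hypothetical local mirror of the two
// v1alpha1.PodGroupSpec fields quoted in the thread.
type podGroupSpec struct {
	MinMember         int32
	PriorityClassName string
}

// buildPodGroupSpec (hypothetical helper) builds the spec in one place, so
// no separate "Test" variant of the sync function is needed.
// An empty priorityClassName simply leaves the field at its zero value.
func buildPodGroupSpec(minAvailable int32, priorityClassName string) podGroupSpec {
	return podGroupSpec{
		MinMember:         minAvailable,
		PriorityClassName: priorityClassName,
	}
}

func main() {
	spec := buildPodGroupSpec(3, "high-priority")
	fmt.Printf("%+v\n", spec)
}
```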
pkg/controller.v1/pytorch/status.go
Outdated
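@@ -69,6 +69,7 @@ func (pc *PyTorchController) updateStatusSingle(job *pyv1.PyTorchJob, rtype pyv1

     // Expect to have `replicas - succeeded` pods alive.
     commonType := common.ReplicaType(rtype)
+    // (comment originally written in Chinese) expected is the success indicator: when it equals 0, the number of succeeded pods equals the number of replicas, and the job is considered successful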
Please use English here
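For readers following the hunk under review, here is a small sketch of the rule its comments describe: the controller expects `replicas - succeeded` pods to still be alive, and a result of 0 means every replica has succeeded. The function names `expectedAlive` and `allSucceeded` are hypothetical and are not taken from the operator's code.

```go
package main

import "fmt"

// expectedAlive (hypothetical name) returns how many pods should still be
// alive: the replica count minus the pods that already succeeded.
func expectedAlive(replicas, succeeded int32) int32 {
	return replicas - succeeded
}

// allSucceeded (hypothetical name) reports whether every replica has
// succeeded, which is the case when no more pods are expected to be alive.
func allSucceeded(replicas, succeeded int32) bool {
	return expectedAlive(replicas, succeeded) == 0
}

func main() {
	fmt.Println(expectedAlive(4, 1), allSucceeded(4, 1))
	fmt.Println(expectedAlive(4, 4), allSucceeded(4, 4))
}
```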
@gaocegege Thank you. I will change the code you mentioned.
pkg/controller.v1/pytorch/controller.go:1::warning: file is not goimported (goimports)
pkg/controller.v1/pytorch/job.go:1::warning: file is not goimported (goimports)
pkg/controller.v1/pytorch/job.go:221:2:warning: should merge variable declaration with assignment on next line (S1021) (staticcheck)
Thanks
PS, please fix the linting issues.
/ok-to-test
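The S1021 warning in the linter output refers to a variable that is declared on one line and assigned on the next. The sketch below shows the flagged shape and the merged form, using hypothetical function names; it is not the actual code at `job.go:221`, which is not shown in this conversation.

```go
package main

import "fmt"

// sumSplit uses the shape staticcheck S1021 flags: the variable is declared
// on one line and assigned on the very next line.
func sumSplit(a, b int32) int32 {
	var total int32
	total = a + b
	return total
}

// sumMerged merges declaration and assignment, which is what S1021 asks for.
func sumMerged(a, b int32) int32 {
	total := a + b
	return total
}

func main() {
	fmt.Println(sumSplit(2, 3) == sumMerged(2, 3))
}
```

Both functions behave identically; the change is purely stylistic.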
I'm not sure what is causing the goimports issue.
@YesterdayxD Then you can address the comments above, and I can have a look at the issues.
CLAs look good, thanks! ℹ️ Googlers: Go here for more info. |
@gaocegege I fixed the goimports issue :)
@@ -216,3 +216,8 @@ func getTotalFailedReplicas(job *pyv1.PyTorchJob) int32 {
     }
     return totalFailedReplicas
 }
+
+func getPriorityClassName(job *pyv1.PyTorchJob) string {
+    priorityClassName := *(job.Spec.PriorityClassName)
There may be a runtime error here: invalid memory address or nil pointer dereference.
I need to remove the parentheses,
right? @gaocegege
No, I think you need to check whether it is nil.
I got it.
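One way the nil check requested in this thread could look is sketched below. The `pyTorchJob` and `pyTorchJobSpec` types are hypothetical stand-ins for `pyv1.PyTorchJob`, and returning an empty string for an unset field is an assumption; the PR's final behavior is not shown in this conversation.

```go
package main

import "fmt"

// pyTorchJobSpec and pyTorchJob are hypothetical stand-ins for the real
// pyv1 types, reduced to the one optional field under discussion.
type pyTorchJobSpec struct {
	PriorityClassName *string
}

type pyTorchJob struct {
	Spec pyTorchJobSpec
}

// getPriorityClassName returns the configured priority class name, or ""
// (an assumed default) when the job or the optional field is nil, instead
// of dereferencing a nil pointer and panicking.
func getPriorityClassName(job *pyTorchJob) string {
	if job == nil || job.Spec.PriorityClassName == nil {
		return ""
	}
	return *job.Spec.PriorityClassName
}

func main() {
	name := "high-priority"
	fmt.Printf("%q\n", getPriorityClassName(&pyTorchJob{}))
	fmt.Printf("%q\n", getPriorityClassName(&pyTorchJob{Spec: pyTorchJobSpec{PriorityClassName: &name}}))
}
```

Removing the parentheses alone would not help: `*job.Spec.PriorityClassName` and `*(job.Spec.PriorityClassName)` dereference the same pointer.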
@YesterdayxD: The following tests failed, say
I added support for kube-batch priority to pytorch-operator. I also changed the tf-operator and kube-batch code in the vendor directory, because the vendored code was not the latest version.