stateful deployments: find feasible node for sticky host volumes #24558
base: dynamic-host-volumes
Conversation
scheduler/feasible.go (Outdated):

```go
// belonging to the job in question that are sticky, have the
// volume IDs that match what the node offers and should not
// end up in this check?
allocs, err := h.ctx.ProposedAllocs(n.ID)
```
The only other place we use `ctx.ProposedAllocs` for feasibility checking is in the distinct hosts checker, because in that case we need to know about all the allocations on the node, including those that might already exist. Do we really want to look at the old allocs here?
Right. I'm just wondering what's the best way to get the full allocation data from here? `hasVolumes` has access to the state store, but it's unaware of allocation ID or namespace. I could just get all the allocs from the state store and filter by node/terminal status, etc.? I thought perhaps looking at `ProposedAllocs` would be simpler.
True, these allocs are full `structs.Allocation` values and include the proposed allocs that aren't in the state store yet, so I guess we do want to use this list as our starting point. But I'm pretty sure this list includes allocs that don't belong to the job, which might not have volume requests at all. Maybe that's ok as long as we're checking for the empty slice?
Having a test case that covers unrelated allocs or allocs that need different volumes would go a long way towards building our confidence that this is the right approach.
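To make the filtering discussed above concrete, here's a minimal sketch of narrowing a proposed-allocation list down to the volume IDs that matter. All type and field names are illustrative stand-ins, not Nomad's actual structs:

```go
package main

import "fmt"

// alloc is a hypothetical, heavily simplified stand-in for structs.Allocation.
type alloc struct {
	jobID     string
	terminal  bool
	volumeIDs []string // host volume IDs this allocation has claimed
}

// claimedVolumeIDs keeps only the host volume IDs claimed by live
// allocations of one job. Allocations from other jobs, or ones with no
// volume claims, contribute nothing, so an empty result is a usable
// "no previous claim" signal.
func claimedVolumeIDs(allocs []*alloc, jobID string) []string {
	var ids []string
	for _, a := range allocs {
		if a.jobID != jobID || a.terminal {
			continue
		}
		ids = append(ids, a.volumeIDs...)
	}
	return ids
}

func main() {
	allocs := []*alloc{
		{jobID: "web", volumeIDs: []string{"vol-1"}},
		{jobID: "batch", volumeIDs: []string{"vol-9"}},               // unrelated job
		{jobID: "web", terminal: true, volumeIDs: []string{"vol-0"}}, // stopped
	}
	fmt.Println(claimedVolumeIDs(allocs, "web")) // prints [vol-1]
}
```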
4b3f06c introduces a new approach, based on allocation IDs.
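An allocation-ID-driven check could look roughly like the sketch below. Names and types are hypothetical; the real logic lives in `HostVolumeChecker.hasVolumes` in scheduler/feasible.go:

```go
package main

import "fmt"

// stickyNode is a hypothetical, trimmed-down stand-in for structs.Node.
type stickyNode struct {
	hostVolumes map[string]string // volume name -> volume ID the node offers
}

// stickyFeasible reports whether a node can take the request. A sticky
// request with previously claimed IDs is only feasible on a node offering
// one of those IDs; with no claim yet, any node offering the volume works.
func stickyFeasible(n *stickyNode, source string, sticky bool, claimed []string) bool {
	id, ok := n.hostVolumes[source]
	if !ok {
		return false
	}
	if !sticky || len(claimed) == 0 {
		return true
	}
	for _, c := range claimed {
		if c == id {
			return true
		}
	}
	return false
}

func main() {
	n := &stickyNode{hostVolumes: map[string]string{"data": "vol-1"}}
	fmt.Println(stickyFeasible(n, "data", true, []string{"vol-1"})) // true: claimed here
	fmt.Println(stickyFeasible(n, "data", true, []string{"vol-2"})) // false: claimed elsewhere
}
```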
```go
name   string
node   *structs.Node
expect bool
}{
```
We should include a case for a sticky volume request that hasn't been previously claimed.
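A condensed, hypothetical version of such a table-driven case list, including the "never previously claimed" case, might look like this (the predicate is a simplified stand-in for the checker under test, not Nomad's real code):

```go
package main

import "fmt"

// caseNode is an illustrative stand-in for structs.Node in the test table.
type caseNode struct {
	volumes map[string]string // volume name -> ID
}

// caseFeasible is a simplified stand-in predicate for the checker under test.
func caseFeasible(n *caseNode, source string, claimed []string) bool {
	id, ok := n.volumes[source]
	if !ok {
		return false
	}
	if len(claimed) == 0 {
		return true // sticky but never claimed: first placement is unconstrained
	}
	for _, c := range claimed {
		if c == id {
			return true
		}
	}
	return false
}

func main() {
	cases := []struct {
		name    string
		node    *caseNode
		claimed []string
		expect  bool
	}{
		{"sticky, never claimed", &caseNode{map[string]string{"data": "vol-1"}}, nil, true},
		{"sticky, claimed on this node", &caseNode{map[string]string{"data": "vol-1"}}, []string{"vol-1"}, true},
		{"sticky, claimed elsewhere", &caseNode{map[string]string{"data": "vol-2"}}, []string{"vol-1"}, false},
	}
	for _, tc := range cases {
		if got := caseFeasible(tc.node, "data", tc.claimed); got != tc.expect {
			fmt.Printf("FAIL %s: got %v, want %v\n", tc.name, got, tc.expect)
		}
	}
}
```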
Co-authored-by: Tim Gross <[email protected]>
@tgross I addressed your comments. Here's what we do now:
I added some tests and I think this is going the right direction. What do you think?
I think this is starting to come together @pkazmierczak. If you can, it'd be worth taking a moment to build it and do some bench testing with real deployments, alloc failures, etc. to make sure we're not missing cases where the volumes should be getting handed off.
```diff
 } else {
-	h.volumeReqs = append(h.volumeReqs, req)
+	h.volumeReqs = append(h.volumeReqs, &allocVolumeRequest{hostVolumeIDs: allocHostVolumeIDs, volumeReq: req})
```
If we're assigning the same `[]string` to every request, why not have it on the `HostVolumeChecker` instead of on individual requests like this? It's only copying around the trio of pointers for a slice one or two extra times, but it grabs my attention as something I'm maybe missing?
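The suggestion above could be sketched roughly like this: keep the claimed-ID slice on the checker itself, assigned once per task group, instead of copying it into every volume request. All names here are illustrative, not Nomad's actual API:

```go
package main

import "fmt"

// checkerVolumeReq is a hypothetical, simplified volume request.
type checkerVolumeReq struct {
	source string
	sticky bool
}

// hostVolumeChecker holds the shared claimed-ID slice once, rather than
// duplicating it on each request.
type hostVolumeChecker struct {
	volumeReqs []*checkerVolumeReq
	claimedIDs []string // shared by all requests in this task group
}

// setVolumes resets the checker for a task group: requests are collected as
// before, and the claimed IDs are assigned exactly once.
func (h *hostVolumeChecker) setVolumes(reqs []*checkerVolumeReq, claimedIDs []string) {
	h.volumeReqs = append(h.volumeReqs[:0], reqs...)
	h.claimedIDs = claimedIDs
}

func main() {
	h := &hostVolumeChecker{}
	h.setVolumes([]*checkerVolumeReq{{source: "data", sticky: true}}, []string{"vol-1"})
	fmt.Println(len(h.volumeReqs), h.claimedIDs) // prints: 1 [vol-1]
}
```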
```diff
@@ -349,7 +351,7 @@ func (s *SystemStack) Select(tg *structs.TaskGroup, options *SelectOptions) *Ran
 	s.taskGroupDrivers.SetDrivers(tgConstr.drivers)
 	s.taskGroupConstraint.SetConstraints(tgConstr.constraints)
 	s.taskGroupDevices.SetTaskGroup(tg)
-	s.taskGroupHostVolumes.SetVolumes(options.AllocName, s.jobNamespace, tg.Volumes)
+	s.taskGroupHostVolumes.SetVolumes(options.AllocName, s.jobNamespace, tg.Volumes, options.AllocationHostVolumeIDs)
```
This raises a question: is "sticky" even meaningful for system and sysbatch jobs? It is, because of CSI volumes; nevermind 😁
This changeset implements node feasibility checks for sticky host volumes.