stateful deployments: find feasible node for sticky host volumes #24558

Open · wants to merge 28 commits into base: dynamic-host-volumes

Conversation

@pkazmierczak (Contributor) commented on Nov 27, 2024:

This changeset implements node feasibility checks for sticky host volumes.
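For illustration, a minimal sketch of the core check, using simplified placeholder types rather than Nomad's actual scheduler interfaces: a node is feasible for a sticky request only if it offers every host volume ID the allocation previously claimed.

```go
package main

import "fmt"

// node is a simplified stand-in for the scheduler's node representation.
type node struct {
	ID          string
	HostVolumes map[string]struct{} // IDs of host volumes present on this node
}

// stickyVolumesFeasible reports whether the node offers every volume ID
// that a previous placement of this allocation claimed.
func stickyVolumesFeasible(n *node, claimedVolumeIDs []string) bool {
	for _, id := range claimedVolumeIDs {
		if _, ok := n.HostVolumes[id]; !ok {
			return false
		}
	}
	return true
}

func main() {
	n := &node{ID: "node-1", HostVolumes: map[string]struct{}{"vol-abc": {}}}
	fmt.Println(stickyVolumesFeasible(n, []string{"vol-abc"})) // true
	fmt.Println(stickyVolumesFeasible(n, []string{"vol-xyz"})) // false
}
```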

@tgross force-pushed the dynamic-host-volumes branch from e2cd778 to 3686951 on December 2, 2024 at 14:12
@tgross force-pushed the dynamic-host-volumes branch from bef9714 to 8c3d8fe on December 3, 2024 at 19:11
@pkazmierczak changed the title from "Stateful deployments" to "stateful deployments: host volumes" on December 5, 2024
@tgross force-pushed the dynamic-host-volumes branch from 1f8c378 to a447645 on December 9, 2024 at 20:27
scheduler/feasible.go (outdated review thread, resolved)
scheduler/generic_sched.go (outdated review thread, resolved)
@pkazmierczak changed the title from "stateful deployments: host volumes" to "stateful deployments: find feasible node for sticky host volumes" on December 11, 2024
@pkazmierczak marked this pull request as ready for review on December 11, 2024 at 15:18
@pkazmierczak requested review from a team as code owners on December 11, 2024 at 15:18
// belonging to the job in question that are sticky, have the
// volume IDs that match what the node offers and should not
// end up in this check?
allocs, err := h.ctx.ProposedAllocs(n.ID)

@tgross (Member):

The only other place we use ctx.ProposedAllocs for feasibility checking is in the distinct hosts checker, because in that case we need to know about all the allocations on the node, including those that might already exist. Do we really want to look at the old allocs here?
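For reference, the distinct_hosts style of check works roughly like this sketch (placeholder types, not the real DistinctHostsIterator): it has to see every allocation proposed for or already running on the node, which is why it goes through ProposedAllocs.

```go
// nodeAlloc is a simplified stand-in for an allocation on the node.
type nodeAlloc struct {
	JobID    string
	Terminal bool
}

// distinctHostsFeasible rejects the node if another live alloc of the same
// job is already (or is proposed to be) placed there.
func distinctHostsFeasible(proposed []nodeAlloc, jobID string) bool {
	for _, alloc := range proposed {
		if alloc.Terminal {
			continue
		}
		if alloc.JobID == jobID {
			return false
		}
	}
	return true
}
```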

@pkazmierczak (Contributor, Author):

Right. I'm just wondering what the best way is to get the full allocation data from here. hasVolumes has access to the state store, but it's unaware of the allocation ID or namespace. I could get all the allocs from the state store and filter by node, terminal status, etc., but I thought looking at ProposedAllocs would be simpler.
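Roughly what that state-store option would look like, sketched with simplified stand-in types rather than the real state store API:

```go
// storeAlloc is a simplified stand-in for an allocation from the state store.
type storeAlloc struct {
	NodeID       string
	JobID        string
	ClientStatus string // e.g. "running", "complete", "failed", "lost"
}

// liveAllocsForJobOnNode keeps only the non-terminal allocs that belong to
// the job being scheduled and run on the given node.
func liveAllocsForJobOnNode(all []*storeAlloc, nodeID, jobID string) []*storeAlloc {
	var out []*storeAlloc
	for _, a := range all {
		if a.NodeID != nodeID || a.JobID != jobID {
			continue
		}
		switch a.ClientStatus {
		case "complete", "failed", "lost":
			continue // skip terminal allocs
		}
		out = append(out, a)
	}
	return out
}
```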

@tgross (Member):

True, these allocs are full structs.Allocation values and include the proposed allocs that aren't in the state store yet, so I guess we do want to use this list as our starting point. But I'm pretty sure this list also includes allocs that aren't part of the job and might not have volume requests at all. Maybe that's ok as long as we're checking for the empty slice?

Having a test case that covers unrelated allocs or allocs that need different volumes would go a long way towards building our confidence that this is the right approach.
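Something along these lines, where mockNodeWithVolumes is a made-up helper standing in for however the test builds a node offering the given host volumes:

```go
// Illustrative table entries only; mockNodeWithVolumes is hypothetical.
cases := []struct {
	name   string
	node   *structs.Node
	expect bool
}{
	{
		name:   "node has only an unrelated alloc, no sticky claim",
		node:   mockNodeWithVolumes("vol-unrelated"),
		expect: true, // allocs from other jobs shouldn't fail the check
	},
	{
		name:   "previous claim is for a volume this node doesn't offer",
		node:   mockNodeWithVolumes("vol-unrelated"),
		expect: false,
	},
}
```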

@pkazmierczak (Contributor, Author):

4b3f06c introduces a new approach, based on allocation IDs.

scheduler/feasible.go (outdated review thread, resolved)
nomad/structs/structs.go (outdated review thread, resolved)
name string
node *structs.Node
expect bool
}{

@tgross (Member):

We should include a case for a sticky volume request that hasn't been previously claimed.
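For example, roughly like this (nodeWithHostVolume is a hypothetical helper, and the expected result assumes an unclaimed sticky request falls back to ordinary host volume matching):

```go
{
	// Sticky volume requested, but nothing has claimed it yet: the
	// checker should fall back to ordinary host volume matching.
	name:   "sticky request with no prior claim",
	node:   nodeWithHostVolume("vol-sticky"), // hypothetical helper
	expect: true,
},
```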

@pkazmierczak (Contributor, Author):

@tgross I addressed your comments. Here's what we do now:

  • we store hostVolumeIDs instead of allocIDs, as you suggested; it's much easier indeed
  • in computePlacements we either add the volume IDs to new allocations or carry them over if the previous allocation had them.

I added some tests and I think this is going in the right direction. What do you think?
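The carry-over in computePlacements boils down to something like this sketch (placeholder type and field names, not the actual structs):

```go
// previousAlloc and its HostVolumeIDs field are placeholders, not the
// real structs.Allocation.
type previousAlloc struct {
	HostVolumeIDs []string
}

// carriedVolumeIDs keeps the volume IDs from the previous allocation if it
// had any (sticky), otherwise records the IDs chosen for this placement.
func carriedVolumeIDs(prev *previousAlloc, newlySelected []string) []string {
	if prev != nil && len(prev.HostVolumeIDs) > 0 {
		return prev.HostVolumeIDs
	}
	return newlySelected
}
```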

@tgross (Member) left a comment:

I think this is starting to come together @pkazmierczak. If you can, it'd be worth taking a moment to build it and do some bench testing with real deployments, alloc failures, etc., to make sure we're not missing cases where the volumes should be getting handed off.

scheduler/generic_sched.go (outdated review thread, resolved)

 	} else {
-		h.volumeReqs = append(h.volumeReqs, req)
+		h.volumeReqs = append(h.volumeReqs, &allocVolumeRequest{hostVolumeIDs: allocHostVolumeIDs, volumeReq: req})

@tgross (Member):
If we're assigning the same []string to every request, why not have it on the HostVolumeChecker instead of on individual requests like this? It's only copying the slice header (a pointer plus length and capacity) around one or two extra times, but it grabs my attention as something I'm maybe missing?
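Rough shape of what I mean, with unrelated checker fields omitted and the SetVolumes body heavily simplified:

```go
// Sketch only: store the claimed IDs once on the checker rather than
// copying them into every allocVolumeRequest.
type HostVolumeChecker struct {
	volumeReqs    []*structs.VolumeRequest
	hostVolumeIDs []string // set once per SetVolumes call
}

func (h *HostVolumeChecker) SetVolumes(allocName, ns string, volumes map[string]*structs.VolumeRequest, allocHostVolumeIDs []string) {
	h.hostVolumeIDs = allocHostVolumeIDs
	h.volumeReqs = h.volumeReqs[:0]
	for _, req := range volumes {
		// expand/filter the request exactly as the checker does today
		h.volumeReqs = append(h.volumeReqs, req)
	}
}
```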

@@ -349,7 +351,7 @@ func (s *SystemStack) Select(tg *structs.TaskGroup, options *SelectOptions) *Ran
 	s.taskGroupDrivers.SetDrivers(tgConstr.drivers)
 	s.taskGroupConstraint.SetConstraints(tgConstr.constraints)
 	s.taskGroupDevices.SetTaskGroup(tg)
-	s.taskGroupHostVolumes.SetVolumes(options.AllocName, s.jobNamespace, tg.Volumes)
+	s.taskGroupHostVolumes.SetVolumes(options.AllocName, s.jobNamespace, tg.Volumes, options.AllocationHostVolumeIDs)

@tgross (Member):
This raises a question: is "sticky" even meaningful for system and sysbatch jobs? It is, because of CSI volumes; never mind 😁

scheduler/stack.go (outdated review thread, resolved)
scheduler/generic_sched_test.go (outdated review thread, resolved)