-
Notifications
You must be signed in to change notification settings - Fork 79
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New telemetry agent config #813
New telemetry agent config #813
Conversation
43425c8
to
77a122a
Compare
I had a few questions on this one:
I propose that:
|
Sounds good
Sure, I don't see a reason to remove the volume.
Vector applies to Reaper and Stargate as well. |
This doesn't break anything, the yaml stays the same for Spec.Cassandra.Telemetry. The only breakage will occur if users have been adding the Mcac field to their telemetry specs for Reaper/Stargate, in which case that configuration is actually unsupported/broken anyway. |
ok then, if there's no breakage then you can go nuts. |
return nil | ||
} | ||
|
||
func (c Configurator) AddStsVolumes(dc *cassdcapi.CassandraDatacenter) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you reuse AddOrUpdateVolume()
which is in pkg/cassandra/datacenter.go
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a close look at this and wasn't quite sure that it was appropriate. I'm open minded here, but it seems that the function you're referring to works on the datacenterConfig
type, whereas we currently work directly on the cassDC and hook into the reconciler at that level.
If you want me to investigate a change of this nature in more detail, I'm happy to do so, but it is a fairly deep re-write.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The advantage is that using that function does overwrite any existing Volume
which might already exist. But also, we don't usually provide guarantees in our controllers that a manually modified object will be self-healed back to the desired state. This is in line with in-built k8s controllers like the STS controller.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems that the function you're referring to works on the datacenterConfig type
Right, and it actually makes no sense, we could have it work on a PodTemplateSpec.
I think we have an in flight PR that will actually change the function to get it to work on the PodTemplateSpec, which will allow to use it for Deployments as well.
Not a blocker for me, let's refactor it in the future.
I don't quite know what's happening here, but for some reason using @burmanm I might need to update your metrics agent again I'm sorry :( |
e910a8a
to
ecf5348
Compare
Can I draw any reviewer's attention to this block, which is the default location for the configuration file, and the default configuration itself. Is this the way we want it set up?
|
Codecov Report
@@ Coverage Diff @@
## main #813 +/- ##
==========================================
+ Coverage 56.30% 56.65% +0.35%
==========================================
Files 96 98 +2
Lines 9547 9671 +124
==========================================
+ Hits 5375 5479 +104
- Misses 3685 3703 +18
- Partials 487 489 +2
|
This is ready to go as far as I can see. The only areas where we may want changes are:
|
210bde0
to
38b4167
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may need to rebase again to get the fix on the integration tests.
The PR looks good, there are just a few comments to remove and a couple minor comments.
} | ||
cmObjectKey := types.NamespacedName{Name: c.Kluster.Name + "-metrics-agent-config", | ||
Namespace: c.Kluster.Namespace} | ||
labels.SetManagedBy(desiredCm, cmObjectKey) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's no need to set the object as "Managed by" if it's "part of" the cluster.
Managed by is used to monitor externally created objects so that reconcile triggers when they are modified.
Here the object is definitely created by the operator. Right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adejanovski yes the object is definitely created by the operator. Once the tests are passing I'll try removing the "Managed by" labels, and see if the tests keep passing.
return nil | ||
} | ||
|
||
func (c Configurator) AddStsVolumes(dc *cassdcapi.CassandraDatacenter) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems that the function you're referring to works on the datacenterConfig type
Right, and it actually makes no sense, we could have it work on a PodTemplateSpec.
I think we have an in flight PR that will actually change the function to get it to work on the PodTemplateSpec, which will allow to use it for Deployments as well.
Not a blocker for me, let's refactor it in the future.
@adejanovski do you want me to just risk merging this and see what happens? Or should I try to figure out some way to debug manually? |
056d92f
to
51dc62c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Miles-Garnsey, please merge the PR after correcting the default path for the metrics config file.
… struct field from method.
81ccfe5
to
427a4df
Compare
(From offline) We hit an issue where older versions of Cassandra caused test failures because jackson XML's methods changed between versions, and different Cassandra 3.11.x versions used different Jackson versions. To move things forward, I have updated all tests to use Cassandra 3.11.14, which should be compatible with the new metrics agent. I will leave it to the rest of the team (ping @burmanm and @adejanovski) to decide whether(a) we want to continue supporting older Cassandra versions with the new metrics agent, or whether (b) we will make the call that older patch versions of Cassandra cannot benefit from the new agent. In general, the changes in this PR will be compatible with either whatever scenario (since we are just adding a file to a directory in |
Remove `ManagedBy` labels.
This reverts commit fb50687.
SonarCloud Quality Gate failed. 0 Bugs No Coverage information |
What this PR does:
Adds configuration functionality for the new telemetry agent.
Which issue(s) this PR fixes:
Fixes #815
Checklist