-
Notifications
You must be signed in to change notification settings - Fork 193
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
do not run apt update or simulate apt dist-upgrade #181
do not run apt update or simulate apt dist-upgrade #181
Conversation
dc995a9
to
3b17a4d
Compare
additional data point, output is similar before/after here:
the script also runs almost 4 times faster, naturally... of course, the above is a rather dumb example, with no pending upgrades... i'll check fleet wide if there are changes shortly. update: well. we don't have any pending upgrades right now, so that's zero fleet-wide, but i'm pretty confident this will Just Work, especially given jak's comment in the Debian bug tracker: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1028212#62 |
I was never a fan of this script updating the cache, and suggested on at least one occasion that we'd be better off running the script as an unprivileged user by default, just so that it wouldn't try to update the cache. OTOH, there will be people who run this script without asynchronous updating of the cache (via e.g. apticron or unattended-upgrades). I'd sorta prefer to put this cache-updating code behind a command-line switch (disabled by default), for such cases. And to address the issue of it hitting the repo servers too frequently, how about something like taking modulo 86400 of the current Unix timestamp, to ensure it only hits the repos once per day, regardless of how often it is run? |
We use the `pkgcache.bin` modification time as a heuristic, but there might be a better way to do this. We also silently ignore errors when pulling that file to avoid broken systems from breaking the other metrics. I believe this will lead to a null metric which is probably what we want anyway. A possible improvement to this would be to individually inspect the `InRelease` files and report the mirror's timestamps as well, but that's more involved. It's also not a priority compared to this fix, because we want the update timestamp if we remove the `apt update` run (in prometheus-community#181). Closes: prometheus-community#180 Signed-off-by: Antoine Beaupré <[email protected]>
On 2023-10-13 09:56:29, Daniel Swarbrick wrote:
OTOH, there will be people who run this script _without_ asynchronous updating of the cache (via e.g. apticron or unattended-upgrades). I'd sorta prefer to put this cache-updating code behind a command-line switch (disabled by default), for such cases.
The problem with this is that it's a policy decision that's largely
disconnected from when you want to report on the actual state of the
system.
In the Debian package, for example, this script runs every fifteen
minutes. If you run apt-update every fifteen minutes on all your
machines, you are going to hurt the mirrors, and you're being a bad
citizen. You shouldn't do that.
We were doing this, but we didn't even *know* we were doing this. We
were just running this small metric here, surely that causes no harm?
I *really* don't think this should run apt-update. It's slow, and it's
hurting the mirrors, it's just a plain bad idea.
If people need a job to update the cache, they should make a job to
update the cache, i don't get why it would be the exporter's job to do
so.
The previous shell implementation of this, by the way, did not run apt
update either, so I consider this a serious regression of the apt update
rewrite.
I'm not completely closed to rewriting this to have a commandline flag
to enable apt update, but I think it's overkill, and I am not
particularly interested in working on this myself.
Finally, keep in mind this patch is aimed for inclusion in a Debian
stable release. The more complex we make it, the more trouble it will be
to ship there (and elsewhere).
|
I don't see that it would be too complex to simply ensure that the cache update is not performed unless explicitly opted into (for those who need it). By making the flag disabled by default, no extra changes to the .service would be needed. It would merely be a behaviour change (albeit one that could be reverted, unlike stripping out the functionality entirely). I suggested such a flag several months ago: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1028212#10 This repo has no tags, so is effectively a rolling release. Therefore what Debian stable or any other distro which packages these scripts does is up to that distro. |
On 2023-10-13 11:31:43, Daniel Swarbrick wrote:
[...]
This repo has no tags, so is effectively a rolling release. Therefore what Debian stable or any other distro which packages these scripts does is up to that distro.
I am a Debian developer. I am providing this very patch as a hotfix for
this problem in Debian specifically. If you want to make it more
elaborate, I guess it's your call, but I would suggest taking the patch
as is and adding the extra feature on top after, that way we can get
moving here... :)
Are we in agreement that, by default, we shouldn't run apt-get update
here? This is what this patch does...
|
Also, I did a bit of research and the normal way of ensuring the package list is up to date on APT systems is to enable the
I hardly see why this specific script should do that job instead... |
I think one key concept that I might not have mentioned here is that this script reports on two different tasks: updating the package list and actually upgrading pending packages. The latter can be done pretty much any time an admin passes by a server or unattended-upgrades runs, and it's worth running often to check for that. The former, however, is needless to run repeatedly because mirrors themselves (heck, ftp-master.debian.org even) do not upgrade that often (dinstall runs every 15 minutes if my memory is correct). So we really shouldn't do any action here: just like this script doesn't (and shouldn't!) perform actual upgrades, I don't think it should update the package list either. I really like the idea of having the script deliberately readonly, and I think it's going to make everything simpler if we keep it like that. But again, if you want to take this further, feel free... I just don't want to get bogged down in those implementation details right now. |
Yes, I would prefer users update the apt-cache on their own schedule and this script does not. It's much better to have the package cache run async from the script and avoid side effects. Even behind a flag, I don't think it's good to include it. There are lots of better ways to deal with updating the apt package cache. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you mind adding the apt config notes as a documentation comment to the top of the script?
3b17a4d
to
fddbd16
Compare
done. |
We use the `pkgcache.bin` modification time as a heuristic, but there might be a better way to do this. We also silently ignore errors when pulling that file to avoid broken systems from breaking the other metrics. I believe this will lead to a null metric which is probably what we want anyway. A possible improvement to this would be to individually inspect the `InRelease` files and report the mirror's timestamps as well, but that's more involved. It's also not a priority compared to this fix, because we want the update timestamp if we remove the `apt update` run (in prometheus-community#181). Closes: prometheus-community#180 Signed-off-by: Antoine Beaupré <[email protected]> Co-authored-by: Ben Kochie <[email protected]>
This is causing all sorts of problems. The first of which is that we're hitting our poor mirrors every time the script is ran, which, in the Debian package configuration, is *every 15 minutes* (!!). The second is that this locks the cache and makes this script needlessly stumble upon a possible regression in APT from Debian bookworm and Ubuntu 22.06: https://bugs.launchpad.net/ubuntu/+source/apt/+bug/2003851 That still has to be confirmed: it's possible that `apt update` can hang for a long time, but that shouldn't concern us if we delegate this work out of band. I also do not believe actually performing the `dist-upgrade` calculations is necessary to compute the pending upgrades at all. I've done work with python-apt for other projects and haven't found that to be required: the cache has the necessary information about pending upgrades. Closes: prometheus-community#179 Signed-off-by: Antoine Beaupré <[email protected]> Co-authored-by: Daniel Swarbrick <[email protected]>
e3a34c0
to
524dbae
Compare
thanks for the review and (tacit?) approval, @dswarbrick :) i integrated your changes in the commit with credit in the commitlog. |
i think this is now ready to merge, @SuperQ can you push the magic button? :) |
This is causing all sorts of problems. The first of which is that we're hitting our poor mirrors every time the script is ran, which, in the Debian package configuration, is every 15 minutes (!!).
The second is that this locks the cache and makes this script needlessly stumble upon a possible regression in APT from Debian bookworm and Ubuntu 22.06:
https://bugs.launchpad.net/ubuntu/+source/apt/+bug/2003851
That still has to be confirmed: it's possible that
apt update
can hang for a long time, but that shouldn't concern us if we delegate this work out of band.I also do not believe actually performing the
dist-upgrade
calculations is necessary to compute the pending upgrades at all. I've done work with python-apt for other projects and haven't found that to be required: the cache has the necessary information about pending upgrades.Closes: #179