-
Notifications
You must be signed in to change notification settings - Fork 164
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Report disk usage, fix low disk usage recovery, update EVE overhead limits to avoid running out #3667
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #3667 +/- ##
=======================================
Coverage 17.51% 17.51%
=======================================
Files 3 3
Lines 805 805
=======================================
Hits 141 141
Misses 629 629
Partials 35 35 ☔ View full report in Codecov by Sentry. |
2db4646
to
1df5061
Compare
37c8734
to
49902b4
Compare
6608f86
to
7e2d51e
Compare
@OhmSpectator @rouming I've taken care of all of the comments (and sanitized the set of commits). Please review and approve as appropriate. Also, we should presumably use a directory in /persist for the eve-info to make it easier to clean that up. I can do a separate PR for that if folks agree. |
@eriknordmark, as far as I understand, they want to clean up some space during the tests. Initially, they only wanted to clean up the logs. I recommended removing the contents of the directories mentioned in the Yep, we can discuss it briefly during the call on Thursday (the Community Call) or on Friday (we have a 1-to-1 call scheduled, as I remember). |
return nil | ||
} | ||
|
||
info, err := os.Stat(path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw. @andrewd-zededa made a change recently where he limited the number of Stat calls since it apparently can be expensive to call on each file: #4065
I don't know if the same consideration needs to apply for FindLargeFiles as well..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is my only concern as well. os.Stat
is expensive, as you have to read each file inode. Unfortunately, there is no easier way to do it that I can think of. The way ext4 is built, the directory entries include only the name of the entry and the type. If you want the size, you need to read the inode, for which you need os.Stat()
.
As far as I can see #4065 just saves by not calling os.Stat()
on files outside of a given path, which I think this does already.
If you really want to get fancy, I have a library that knows how to read ext4 natively. It can load the entire inode table into memory, and you can use that to replace the os.Stat()
for each one with an in-memory lookup. It won't save you on the directory tree reading, but it will save you all of the os.Stat()
calls. It doesn't have a public interface for doing that, but I don't mind adding it.
I would suggest that we put this through its paces, including performance test. If that becomes an issue, worry about it then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're only doing this in unknown directories in /persist - all of the known places are in the exclude paths.
This is to catch somebody manually e.g., putting huge tar/iso/img files in /persist/foo.iso or /persist/bar/foo.iso.
Thus the number of things we stat should be very small unless somebody manually dumps a huge directory tree manually into /persist.
Let me get a count on the number of calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's that small, then don't worry too much about it. And definitely don't use that library.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@milan-zededa @deitch I checked a default freshly installed EVE device and there are only two files which get Stat'ed in this loop (/persist/SMART_details.json and /persist/SMART_details_previous.json). And the FindLargeFiles is called every 20 seconds. So this doesn't seem to be a risk for load.
But it was a good idea to check this so thanks for the comment!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just minor comments
Let's get test results
We do have some regression here:
Edit: I see this failed already here: #4115 |
Signed-off-by: eriknordmark <[email protected]>
Signed-off-by: eriknordmark <[email protected]>
Signed-off-by: eriknordmark <[email protected]>
Signed-off-by: eriknordmark <[email protected]>
Signed-off-by: eriknordmark <[email protected]>
Signed-off-by: eriknordmark <[email protected]>
Signed-off-by: eriknordmark <[email protected]>
Signed-off-by: eriknordmark <[email protected]>
Signed-off-by: eriknordmark <[email protected]>
Signed-off-by: eriknordmark <[email protected]>
Signed-off-by: eriknordmark <[email protected]>
This PR addresses several issues related to trying to use "all" storage on the device. They include:
Separately we should probably also make it so that the system containerd can run without needing to allocate space in /persist.
If something uses up too much space in /persist, the console output during boot will include something like this:
Free space in /persist: 2369 MBytes
Free space in /persist is only 2369 hence below the limit 4096 MBytes
Free space in /persist after clearing /persist/log: 24917 MBytes
Free space in /persist after clearing /persist/log and 2s sleep: 24917 MBytes
Free space in /persist after recovery: 24917 MBytes
[The sleep is needed due to ZFS delayed deletes. The above output is with an ext4 /persist.]