Improve case-sensitive path comparison #20
Since @Kataane asked a question about the "file-system-aware" comparer in #84, I decided to elaborate on it here. You see, in the real world, there is no such thing as a "case-sensitive operating system". There is only a "case-sensitive path", or a "subtree", if you will. So, in the harsh reality, each path on the disk has its own comparison rules! On Windows, you can control this on a per-path basis using … On macOS, there are some other crazy ways to switch this, and on Linux, this is obviously at least a per-mount-point thing (as most common drivers try to support Windows case-insensitivity natively). The third path comparer would request this information from the actual file systems being inspected during path comparison, and use it when needed. In particular, let's imagine this scenario: you are on Windows, and have the following directory structure:
And our comparer is asked a question: are paths … I imagine it should work like this:
So, as the result of comparing paths … Obviously, this will require quite a lot of work from us, and it will be quite slow in practice (orders of magnitude slower than the default comparers). But I believe it is a "must have" feature of a file system path library.
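To make the idea of per-path comparison rules concrete, here is a minimal sketch in Python (not the library's own language or API). It assumes '/'-separated absolute paths and a hypothetical `is_case_sensitive(directory)` callback that reports whether names directly inside `directory` are matched case-sensitively; both the function name and the callback are illustrative only. A real implementation would query the file system at that point.

```python
from typing import Callable, List


def split_components(path: str) -> List[str]:
    """Split a '/'-separated path into its non-empty components."""
    return [part for part in path.split("/") if part]


def paths_equal(a: str, b: str, is_case_sensitive: Callable[[str], bool]) -> bool:
    """Compare two absolute paths component by component.

    `is_case_sensitive` is a hypothetical lookup: given a directory, it
    says whether entries directly inside it are compared case-sensitively.
    """
    parts_a = split_components(a)
    parts_b = split_components(b)
    if len(parts_a) != len(parts_b):
        return False

    parent = "/"
    for name_a, name_b in zip(parts_a, parts_b):
        if name_a != name_b:
            # The names differ textually, so they can only refer to the same
            # entry if the containing directory ignores case and the names
            # differ by case alone.
            if is_case_sensitive(parent) or name_a.casefold() != name_b.casefold():
                return False
        # Descend using the spelling from `a`; up to this point both
        # spellings are known to refer to the same directory.
        parent = parent.rstrip("/") + "/" + name_a
    return True
```

For example, with a lookup that marks only `/data/sensitive` as case-sensitive, `/data/Foo.txt` and `/data/foo.txt` compare equal, while `/data/sensitive/Foo.txt` and `/data/sensitive/foo.txt` do not.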
I suggest the following changes.
1. Textual-only comparer: should operate on strict string equality and be named accordingly (something like StrictStringPathComparer?).
2. Platform-default comparer: should implement case-sensitive comparison on Linux, and case-insensitive comparison (probably with corresponding relaxations related to Unicode normalization) on Windows and macOS.
3. File-system-aware comparer: for each compared path component, should check the actual case sensitivity of the corresponding file system subroot. For non-existent paths, it should use the platform-dependent policy for determining the case sensitivity of new subdirectories (is it normally taken from the parent directory?).
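The first two comparers in the list can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names are made up, NFC normalization plus `str.casefold()` stands in for whatever case and normalization rules a real file system applies (those use their own tables and may differ), and every platform other than Windows and macOS is treated as case-sensitive.

```python
import sys
import unicodedata


def strict_equal(a: str, b: str) -> bool:
    """Textual-only comparison: plain string equality, no folding."""
    return a == b


def platform_default_equal(a: str, b: str, platform: str = sys.platform) -> bool:
    """Platform-default comparison.

    Case-insensitive on Windows ("win32") and macOS ("darwin"),
    case-sensitive elsewhere (an assumption for this sketch).
    """
    if platform not in ("win32", "darwin"):
        return a == b

    def fold(text: str) -> str:
        # Approximation: normalize to NFC, then apply Unicode case folding.
        return unicodedata.normalize("NFC", text).casefold()

    return fold(a) == fold(b)
```

The `platform` parameter exists only so the behavior can be exercised on any machine; by default it follows `sys.platform`.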
The file-system-aware comparer is obviously IO-intensive, so I'm thinking of introducing some sort of "sensitivity cache" that would store the checked paths and subtrees in a trie data structure and would be reused for one or multiple operations (probably one cache per comparer instance, with the ability to reset it manually).
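A rough sketch of such a cache, again in Python and with hypothetical names (`SensitivityCache`, `store`, `lookup`, `reset`), could key a trie by path components. It assumes '/'-separated paths and stores one flag per directory that has already been checked; whether a result should also apply to a whole subtree is a policy decision this sketch leaves out.

```python
from typing import Dict, List, Optional


class _Node:
    """One trie node per path component."""

    def __init__(self) -> None:
        self.children: Dict[str, "_Node"] = {}
        # None means "this directory has not been checked yet".
        self.case_sensitive: Optional[bool] = None


class SensitivityCache:
    """Remembers the case sensitivity of directories that were checked."""

    def __init__(self) -> None:
        self._root = _Node()

    @staticmethod
    def _parts(directory: str) -> List[str]:
        return [part for part in directory.split("/") if part]

    def store(self, directory: str, case_sensitive: bool) -> None:
        """Record the result of a file system check for `directory`."""
        node = self._root
        for part in self._parts(directory):
            node = node.children.setdefault(part, _Node())
        node.case_sensitive = case_sensitive

    def lookup(self, directory: str) -> Optional[bool]:
        """Return the cached flag, or None if `directory` was never checked."""
        node = self._root
        for part in self._parts(directory):
            child = node.children.get(part)
            if child is None:
                return None
            node = child
        return node.case_sensitive

    def reset(self) -> None:
        """Drop all cached results (the manual reset)."""
        self._root = _Node()
```

Keys are matched by exact spelling, so a differently cased spelling of a cached directory is simply a cache miss and would trigger a fresh check.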