Follows macOS/APFS "firmlinks" even with .follow_links(false) #169

Open
hippietrail opened this issue Nov 1, 2022 · 6 comments

@hippietrail

hippietrail commented Nov 1, 2022

macOS with APFS has a feature called "firmlinks", which are sometimes described as sitting between hardlinks and symlinks. They are used to make the two system volumes appear like the old single-partition layout: certain directories that live in /System/Volumes/Data/xyz are firmlinked to /xyz.

Swift's directory walking functionality (FileManager's enumerator in Foundation) is aware of these and does not follow them. Rust's walkdir is not aware of them and does follow them. (Note that there are no command-line switches for macOS's ls that reveal them.)

I wrote similar code for both Swift and Rust. It's probably not the best, as I'm just learning both languages. The first argument is the path where the walk begins; the second is a substring to match in the name of a directory, which causes that directory to be printed.

Rust: cargo run / LLVM

/Library/Frameworks/Xamarin.iOS.framework/Versions/15.10.0.5/LLVM
/System/Volumes/Data/Library/Frameworks/Xamarin.iOS.framework/Versions/15.10.0.5/LLVM
/System/Volumes/Data/Users/hippietrail/.vscode-insiders/extensions/ms-vscode.cpptools-1.5.1/LLVM
/System/Volumes/Data/Users/hippietrail/.vscode/extensions/ms-vscode.cpptools-1.12.4-darwin-arm64/LLVM
/System/Volumes/Data/Applications/Xcode.app/Contents/Applications/Instruments.app/Contents/PlugIns/DTLLVMBinaryAnalysisPlugin.xrplugin
/Users/hippietrail/.vscode-insiders/extensions/ms-vscode.cpptools-1.5.1/LLVM
/Users/hippietrail/.vscode/extensions/ms-vscode.cpptools-1.12.4-darwin-arm64/LLVM
/Applications/Xcode.app/Contents/Applications/Instruments.app/Contents/PlugIns/DTLLVMBinaryAnalysisPlugin.xrplugin
done

Swift: dirwalker / LLVM

Library/Frameworks/Xamarin.iOS.framework/Versions/15.10.0.5/LLVM
Users/hippietrail/.vscode-insiders/extensions/ms-vscode.cpptools-1.5.1/LLVM
Users/hippietrail/.vscode/extensions/ms-vscode.cpptools-1.12.4-darwin-arm64/LLVM
Applications/Xcode.app/Contents/Applications/Instruments.app/Contents/PlugIns/DTLLVMBinaryAnalysisPlugin.xrplugin
done

Rust code:

use std::env;
use walkdir::WalkDir;

fn main() {
    let args: Vec<String> = env::args().collect();

    // Check the argument count before indexing, so that a missing argument
    // prints the usage message instead of panicking.
    if args.len() != 3 {
        println!("** usage: first arg is start directory, second is substring to look for in directory names");
        return;
    }

    let path: &str = args[1].as_str();
    let text: &str = args[2].as_str();

    for entry in WalkDir::new(path).follow_links(false)
        .into_iter()
        .filter_map(|e| e.ok())
        .filter(|e| e.file_type().is_dir())
        // to_string_lossy avoids a panic on non-UTF-8 directory names.
        .filter(|e| e.file_name().to_string_lossy().contains(text)) {

        println!("{}", entry.path().display());
    }
    println!("done");
}

Swift code:

import Foundation
import AppKit

let fileManager = FileManager.default

let resKeys : [URLResourceKey] = [.isDirectoryKey, .fileSizeKey, .isSymbolicLinkKey]

let startURL: URL = URL(string: fileManager.currentDirectoryPath)!

guard CommandLine.arguments.count == 3 else {
    print("** usage: dirwalker path string")
    exit(1);
}

let pathArg = CommandLine.arguments[1]
let matchArg = CommandLine.arguments[2]

if let path = pathArg.addingPercentEncoding(withAllowedCharacters: .urlQueryAllowed) {
    if let url = URL(string: path) {
        let en = fileManager.enumerator(at: url,
                                        includingPropertiesForKeys: resKeys,
                                        options: [.producesRelativePathURLs],
                                        errorHandler: { (url, error) -> Bool in
            return true }
        )!

        mainloop: for case let fileURL as URL in en {
            do {
                let rv = try fileURL.resourceValues(forKeys: Set(resKeys))
                if let d = rv.isDirectory, d {
                    
                    let filename: String = fileURL.lastPathComponent;

                    if filename.contains(matchArg) {
                        print(fileURL.relativePath)
                    }
                }
            } catch {
                print("** error 2:", error)
            }
        }
    }
}

print("done")
@BurntSushi
Owner

I'm not sure what the expectation is here. If macOS doesn't report them as symlinks, then that's what macOS has decided: they shouldn't be regarded as symlinks.

@hippietrail
Author

Well, the programmer normally expects that walking a directory visits each directory only once. Perhaps my chosen wording naturally leads to a literalist interpretation, but even then the option itself talks about "links", not about "symlinks".

Perhaps a discussion would include alternatives such as:

  • Should there also be a .follow_firmlinks() just on macOS?
  • Should the current method be renamed to .follow_symlinks()?
  • Should "links" be interpreted to mean both kinds of links?
  • Should there be some completely different option that directly means "don't traverse directories more than once?"
  • Should the documentation state clearly that .follow_links() actually just means symlinks and not links generally?
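
The "don't traverse directories more than once" alternative could be sketched outside walkdir roughly as follows. This is a minimal std-only sketch, not walkdir's API: the function name `walk_dirs_once` is hypothetical, it is Unix-only, and it assumes that both spellings of a firmlinked directory report the same (device, inode) pair, which has not been verified in this thread.

```rust
use std::collections::HashSet;
use std::fs;
use std::os::unix::fs::MetadataExt;
use std::path::{Path, PathBuf};

/// Walks `root` without following symlinks and returns every directory found,
/// skipping any directory whose (device, inode) pair is already in `seen`.
/// Hypothetical helper for illustration; unreadable directories are skipped.
fn walk_dirs_once(root: &Path, seen: &mut HashSet<(u64, u64)>) -> Vec<PathBuf> {
    let mut found = Vec::new();
    let mut stack = vec![root.to_path_buf()];

    while let Some(dir) = stack.pop() {
        // symlink_metadata does not follow symlinks, so a symlink to a
        // directory is reported as a symlink and skipped here.
        let meta = match fs::symlink_metadata(&dir) {
            Ok(m) => m,
            Err(_) => continue,
        };
        if !meta.is_dir() {
            continue;
        }
        // `insert` returns false when this directory was already visited
        // under another path (assumption: same device and inode).
        if !seen.insert((meta.dev(), meta.ino())) {
            continue;
        }
        found.push(dir.clone());

        let entries = match fs::read_dir(&dir) {
            Ok(e) => e,
            Err(_) => continue,
        };
        for entry in entries.flatten() {
            // DirEntry::file_type does not follow symlinks either.
            if entry.file_type().map(|t| t.is_dir()).unwrap_or(false) {
                stack.push(entry.path());
            }
        }
    }
    found
}
```

Calling it with a fresh `HashSet` and `Path::new("/")` would give a walk comparable to the Rust program above. The cost is one extra metadata call per directory and a set that grows with the number of directories visited.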

@BurntSushi
Owner

BurntSushi commented Nov 1, 2022

With respect to naming: follow_links is short for following symlinks. It doesn't change anything about how hardlinks are handled, for example. Note that the first three words of the docs for follow_links are:

Follow symbolic links.

So the docs are already clear that it's just about symlinks.

I'm definitely not going to rename it. And renaming it just because macOS decided to introduce some new weird version of links also seems like a bad way to prioritize things.

I'm inclined to:

  • Leave this issue open.
  • Do nothing in the interim.
  • Collect feedback.
  • Observe how other tools deal with "firm" links.
  • Make a decision later.

@BurntSushi
Owner

And also:

Well the programmer normally has an expectation that walking a directory only visits each directory once.

If that's true, then why doesn't macOS report firm links as symlinks? Like, why pin the responsibility on me here and not on macOS? They introduced firmlinks and they decided not to report them as symlinks.

@zeroflaw

zeroflaw commented Nov 25, 2023

Out of curiosity, I wanted to know how you would go about detecting a 'firmlink', and it seems possible using libc. You would probably want to follow the 'firmlink' (the short version of the path) and ignore the underlying directory that has a 'firmlink' pointing at it. It's super confusing, but it is doable.

#[cfg(test)]
mod tests {

    #[test]
    fn test_libc_detect_firmlink() {
        let app_path_system = std::ffi::CString::new("/System/Volumes/Data/Applications").unwrap();
        let app_path_root = std::ffi::CString::new("/Applications").unwrap();

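        // Open the underlying path on the Data volume. The flags are OR-ed
        // into the single `oflag` argument; a third argument to open() would
        // be the creation mode, not more flags.
        let fd_system = unsafe {
            libc::open(
                app_path_system.as_ptr(),
                libc::O_NONBLOCK | libc::O_DIRECTORY,
            )
        };
        // Open the firmlinked (short) path.
        let fd_root = unsafe {
            libc::open(
                app_path_root.as_ptr(),
                libc::O_NONBLOCK | libc::O_DIRECTORY,
            )
        };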
        assert!(fd_system != -1);
        assert!(fd_root != -1);

        let mut buffer = vec![0; libc::PATH_MAX as usize];
        let r = unsafe { libc::fcntl(fd_system, libc::F_GETPATH, buffer.as_mut_ptr()) };
        assert!(r == 0);
        let get_path_system = std::ffi::CStr::from_bytes_until_nul(&buffer).unwrap();
        println!("(F_GETPATH) -- fd_system: {:?}", get_path_system);

        let mut buffer = vec![0; libc::PATH_MAX as usize];
        let r = unsafe { libc::fcntl(fd_root, libc::F_GETPATH, buffer.as_mut_ptr()) };
        assert!(r == 0);
        let get_path_root = std::ffi::CStr::from_bytes_until_nul(&buffer).unwrap();
        println!("(F_GETPATH) -- fd_root: {:?}", get_path_root);

        let mut buffer = vec![0; libc::PATH_MAX as usize];
        let r = unsafe { libc::fcntl(fd_system, libc::F_GETPATH_NOFIRMLINK, buffer.as_mut_ptr()) };
        assert!(r == 0);
        let get_path_nofirmlink_system = std::ffi::CStr::from_bytes_until_nul(&buffer).unwrap();
        println!(
            "(F_GETPATH_NOFIRMLINK) -- fd_system: {:?}",
            get_path_nofirmlink_system
        );

        let mut buffer = vec![0; libc::PATH_MAX as usize];
        let r = unsafe { libc::fcntl(fd_root, libc::F_GETPATH_NOFIRMLINK, buffer.as_mut_ptr()) };
        assert!(r == 0);
        let get_path_nofirmlink_root = std::ffi::CStr::from_bytes_until_nul(&buffer).unwrap();
        println!(
            "(F_GETPATH_NOFIRMLINK) -- fd_root: {:?}",
            get_path_nofirmlink_root
        );

        println!(
            "path: {:?} is a firmlink: {}",
            app_path_root,
            app_path_root.as_c_str() != get_path_nofirmlink_root
        );
        println!(
            "path: {:?} is a firmlink: {}",
            app_path_system,
            app_path_system.as_c_str() != get_path_nofirmlink_system
        );
    }
}

output:

(F_GETPATH) -- fd_system: "/Applications"
(F_GETPATH) -- fd_root: "/Applications"
(F_GETPATH_NOFIRMLINK) -- fd_system: "/System/Volumes/Data/Applications"
(F_GETPATH_NOFIRMLINK) -- fd_root: "/System/Volumes/Data/Applications"
path: "/Applications" is a firmlink: true
path: "/System/Volumes/Data/Applications" is a firmlink: false
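
The comparison behind that output can be written as a small pure function. This is only a sketch of the idea: the names `FirmlinkRelation` and `classify` are hypothetical, the two kernel-reported strings are assumed to come from F_GETPATH and F_GETPATH_NOFIRMLINK as in the test above, and the walked path is assumed to contain no symlinks or `..` components.

```rust
/// How a walked path relates to a firmlink, judged only from path strings.
/// Hypothetical type for illustration.
#[derive(Debug, PartialEq)]
enum FirmlinkRelation {
    /// Both kernel-reported paths agree, so no firmlink is involved.
    NoFirmlink,
    /// The walked path is the firmlinked (short) spelling.
    ViaFirmlink,
    /// The walked path is the underlying (long) location of a firmlinked directory.
    Underlying,
}

/// `walked` is the path the walker used, `getpath` is the F_GETPATH result,
/// and `getpath_nofirmlink` is the F_GETPATH_NOFIRMLINK result for the same
/// open directory.
fn classify(walked: &str, getpath: &str, getpath_nofirmlink: &str) -> FirmlinkRelation {
    if getpath == getpath_nofirmlink {
        FirmlinkRelation::NoFirmlink
    } else if walked == getpath_nofirmlink {
        FirmlinkRelation::Underlying
    } else {
        FirmlinkRelation::ViaFirmlink
    }
}
```

With the strings from the output above, "/Applications" classifies as `ViaFirmlink` and "/System/Volumes/Data/Applications" as `Underlying`. A walker that wanted to visit each directory once could skip entries classified as `Underlying`.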

@hippietrail
Author

In case anyone is looking for info on detecting firmlinks in Darwin/macOS, apparently the only official way to do it is to use getattrlistbulk() to get ATTR_CMN_FLAGS and check if SF_FIRMLINK is set.
