Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce the API's cognitive load #12

Open
epage opened this issue Feb 11, 2018 · 9 comments
Open

Reduce the API's cognitive load #12

epage opened this issue Feb 11, 2018 · 9 comments

Comments

@epage
Copy link

epage commented Feb 11, 2018

Right now the API is meant to be both easy to use and (I assume) enforcing invariants in types (like ensuring a file is a file and not a dir before reading it). The challenge is that there are a lot of types for the user to deal with and its hard to see where one can get started.

@epage
Copy link
Author

epage commented Feb 11, 2018

I can see this being improved in a mixture of two ways

  • Sacrifice leveraging the type system
  • Provide a better jump off point in the docs so you progressively reveal the structs that build off of others.

@vitiral
Copy link
Collaborator

vitiral commented Feb 11, 2018

@epage thanks for opening up all these great issues!

There are really only two groups of types: Path types and File types. Let's explore them and see if there are areas we could reduce.

Note: I am the author of path_abs so any decision we make can be made directly to that crate.

Path Types

First of all, there is the stdlib Path and PathBuf. I do not want to overlap with their names and I want to use them as the API to emulate/expand upon.

After that there are the following:

  • PathArc: I agree that we could maybe rename this (PathArc's name exposes implementation detail #9), but this is essentially just a PathBuf that is reference counted and has better methods. I don't think you can get rid of this since it is necessary for downcasting the other types without cost.
  • PathAbs: a path that is guaranteed to be absolute. If we were going to "hide" any of the paths this would be my choice (but implement Eq to compare all the others)
  • PathFile: a path to a file with methods that you can do on a file.
  • PathDir: a path to a directory with methods that you can do on a directory.
  • PathType: a path to either a file or directory, i.e. what you get when you list a directory.
  • PathTmp: a PathDir that is automatically deleted when it goes out of scope (wraps tempdir::TempDir).

There is serious overlap between PathAbs and PathType so I would be okay getting rid of PathAbs (hilariously this is the name of the crate... might have to rename path_abs to something else?)

File Types

The following file types are exposed to the user:

  • FileRead: a read-only file handle with an attached path.
  • FileWrite: a write-only file handle with an attached path.
  • FileEdit: a read/write file handle with an attached path
  • (future) FileTmp a temporary file handle, wraps tempfile::TempFile.

Honestly file types are one of my big annoynances. The OpenOptions is error prone since the returned object is impl Write even if you requested a read-only file. I'm open to suggestions here as well though!

Possible Improvement path: file/dir modules

I think it is worth mentioning that the Path types have way too many methods IMO. I would like to maybe have a file and dir module for accomplishing what PathFile::read_str currently accomplishes. I would also like to remove things like PathFile::read() -> FileRead as it is straight up redundant.

Conclusion

Part of the reason there are "so many types" is because the operations you can do on the file system are so varied. When you list a directory, what are you getting back? Well... it could be files or directories (symlinks are auto-resolved). PathType doesn't hide that from you anymore and in rust style forces you to handle both possibilities.

The type system prevents programmer error when using methods and I think that is well worth the expanded API.

@vitiral
Copy link
Collaborator

vitiral commented Feb 11, 2018

So I mostly agree with your second point, which is to improve the docs to "tell the story better."

This brings up another point: where do these docs live? I see the ergo_fs docs as "library/reference" docs and I think just stating the types is correct there. IMO the Ergo Cookbook is the right place to tell this kind of "story" and slowly unveil the types.

@epage
Copy link
Author

epage commented Feb 12, 2018

see the ergo_fs docs as "library/reference" docs and I think just stating the types is correct there.

I think its reasonable for the crate-wide doc string to tell a brief story but I do agree that there will need to be docs outside of that. This again fits into my idea of there being an ergo org. You could then have an ergo.github.io repo for mdbook + other docs to consolidate the ergo themed docs.

@epage
Copy link
Author

epage commented Feb 12, 2018

From your brief description, I can't really tell the difference in purpose

  • PathArc
  • PathAbs: Yes, this is absolute, but why does that even matter?
  • PathType

They all seem to fulfill the same purpose.

What is your goal with having distinct PathFile and PathDir?

Maybe its just me, but with PathTmp, I'd personally separate out ownership vs path.

  • I'd expect a Path to be cloneable
  • into on a path is reasonable ... except that will cause you to drop the TempDir
  • The only way I can pass this around to other places is to call as_path, dropping from ergo to std because of the problems noted above.
  • This all leads to it being less intuitive to consider the ownership.

@epage
Copy link
Author

epage commented Feb 12, 2018

(future) FileTmp a temporary file handle, wraps tempfile::TempFile.

While this might be annoying from the implementation, but should this just be a different factory on the other File types?

How well can I pass FileEdit to functions in place of FileRead or FileWrite?

@vitiral
Copy link
Collaborator

vitiral commented Feb 12, 2018

All of the path types (except PathArc) are canonicalized so that you can always compare them, ask has_prefix, etc and it will give the "expected" answer.

PathArc exists as a zero-cost escape hatch when you want to stop caring about this.

The goal of PathFile vs PathDir is to:

  • Provide expressiveness. I.e. you can say: struct Paths {files: Set<PathFile>, dirs: Set<PathDir>}
  • Provide some degree of type safety: the above example will be guaranteed that those were actually files (at least at one time)
  • Provide related methods: you shouldn't be able to list() a PathFile and you shouldn't be able to read_string() a PathDir.

Maybe its just me, but with PathTmp...

Agh, these are issues! I opened #14

While this might be annoying from the implementation, but should this just be a different factory on the other File types?

I don't think so -- it has to have ownership to handle dropping correctly.

How well can I pass FileEdit to functions in place of FileRead or FileWrite?

Well, FileEdit implements both Read and Write so you could use it in that way. I suppose we should probably provide AsRef<FileRead> etc as well for it!

@epage
Copy link
Author

epage commented Feb 12, 2018

While this might be annoying from the implementation, but should this just be a different factory on the other File types?

I don't think so -- it has to have ownership to handle dropping correctly.

Unlike Path, File* already have an ownership model compatible with Temp*. So one way to merge FileTmp could be by having an Option<TempFile> inside of File*.

@epage
Copy link
Author

epage commented Feb 12, 2018

All of the path types (except PathArc) are canonicalized so that you can always compare them, ask has_prefix, etc and it will give the "expected" answer.

PathArc exists as a zero-cost escape hatch when you want to stop caring about this.

An alternative to always-canonicalized is #10

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants