Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Various new CLI options. #13

Open
wants to merge 15 commits into
base: master
Choose a base branch
from
Open

Various new CLI options. #13

wants to merge 15 commits into from

Conversation

toofar
Copy link

@toofar toofar commented Jul 13, 2014

First of all sorry for having everything on the master branch, I got a bit carried away. If you would prefer to have logical changes on separate branches, github style, I have no problem with that. It will have to wait for tomorrow though, I figure I should take a break before I end up implementing pretty colours or something.

These changes a bunch of new command line options for: matching/filtering filehandles based on directory, case insensitively and file size; matching/filtering processes based on full command line and using wildcards.
It also allows multiple commands (-c) and directories (-D) to be specified at once using commas as delimiter (eg cv -c firefox,wget). Also there is one (premature) optimisation: go through /proc once and try each of the commands for each pid rather than going through /proc for each command.

  -V --verbose          print non-exceptional errors (eg permission denied)
  -g --glob             use fnmatch() when matching process names instead of strcmp()
  -f --full             match against the entire command line instead of just the command name, implies --glob
                        Note that this is an exact match, use -g and \*pattern\* to get a substring match
  -i --case-insensitive match case insensitively
  -D --directory dir    filter results to processes handling files in dir, comma separated
  -S --size bytes       filter results to processes handling files larger than bytes

PS: The diff below may look bigger than it actually is but a bunch of stuff just changed indentation.

@Xfennec
Copy link
Owner

Xfennec commented Jul 14, 2014

What a huge PR, thanks for all this work ! I will have a look, commit by commit, to all this. Some are very interesting, some would require a few test before validation, and for a few ones, I still need to be convinced, as --globfor instance (I love simple things, and too much CLI switches can give a bad feeling to the user discovering the tool, I think).

Let me have a look to all this, and thanks again.

@Xfennec
Copy link
Owner

Xfennec commented Jul 18, 2014

Is there any way (with git CLI) to cherry pick some of these commits ? (from this branch to the HEAD master) That's something I've never done yet.

@toofar
Copy link
Author

toofar commented Jul 18, 2014

Haha, I just looked into this recently (haven't actually done it though). Mostly from this post http://blog.spreedlycom/2014/06/24/merge-pull-request-considered-harmful/. You just add the pr as a new branch (origin pull/ID/head:new-local-branch 1 3) and merge/whatever from there onto master as you please. When you are done (so in the final commit) add "Closes #1234", where 1234 is the pr ID, into the commit message and that should close the pr 2.

toofar added 15 commits July 18, 2014 20:16
This is useful for when the program was called via a symlink. For example on
my system `mplayer` is an alias to `mplayer2`. Since I launch it with
`mplayer` if I wanted to check the progress using cv I would expect to use `cv
-c mplayer` but that currently doesn't work. The original command line
(assuming argv[0] hasn't been modified) has this information:

    $ readlink `which mplayer`
    /etc/alternatives/mplayer
    $ readlink -f `which mplayer`
    /usr/bin/mplayer2

    $ readlink /proc/`pidof mplayer`/exe
    /usr/bin/mplayer2
    $ cat /proc/`pidof mplayer`/cmdline | tr '\0' ' ' | cut -d' ' -f1
    mplayer

Currently I am checking for a full basename match on the executable but it may
be nice to add a -f,--full like pgrep does. Also -g,--glob (fnmatch) my be
useful for partial matches with -c.

This adds an more overhead for every unsuccessful /exe lookup though :(. If
this becomes an option there are a couple of options like: only do this extra
check if -c was passed (but then symlinks for cp (eg from busybox) wouldn't
work), hiding it behind an option, caching.
Might be nice to allow -c to be present on the CLI multiple times too.
Using strerror instead of perror so we can print the pid/path info as well.
Now you can eg `cv -gc chrome\*,mplay\*`.

Not sure about the flags to fnmatch, just read through the man page and
thought they looked appropriate.

Had to switch the printouts (`pid_list[N].name`) to use `basename(exe)` rather
than bn_name because printing out the search pattern with glob isn't all that
useful.
Matches the search pattern against the entire command line from
/proc/pid/cmdline instead of just the command name (argv[0]).
Since we are currently doing an exact match it is only really useful with eg
`cv -fgc \*pattern\*`. Hence implies `--glob`. If we do support substring
matches at some point it might be nice to make `-f` require `-c`.

Since this modifies the string that is copied to `pidinfo_list[i].name` it
could print the wrong process name in the info printout. Specifically since
the null characters from /proc/pid/cmdline are converted to spaces we lose the
distinction between arguments with spaces in them and argument delimiters.
Since we only care about the binary name here then this means that if your
binary has a space in it only the first bit of the name will be printed. This
is surely rare though, no processes in my PATH have spaces in the name.

Removed FNM_PATHNAME on the fnmatch(..., cmdline) call or else it wasn't
matching patterns like `myprog\*` against `myprog /target/dir`.

Also, with `--full` the comparison is no longer against the command basename,
in case people wan't to match against the path or `./thinger` or whatever.
Converts the current binary name/command line to lower case before comparing.
Since this modifies the string that is saved to `pid_list[N].name` then if
your program hase capitals in the name it will be printed out wrong. The only
relevant procces in my PATH is GET.
Now instead of going through proc for each binary we are interested in go
through proc once and loop in the process names for each pid.

I wasn't actually seeing any performance issues so this is really a premature
optimisation...

The diff for `find_pids_by_binary_name` may look really big but that is only
because the indentation changed because of the new inner loop. The only thing
that changed in the inner loop is that I changed a couple of `break`s into
`goto`s to break out of the outer loop. If you don't like `goto`s you could
just add an additional `max_procs` check at the top of the outer loop.

Changed proc_names to be no longer null terminated just so it matches up with
the array generated from the `-c` argument.
So that it can be used for other comma separated argument lists.
Implement an `lsof` alike `-D` argument. Eg

    cv -gc \* -D /tmp

will look through all processes to see if they have a file open in /tmp. The
fd selection for the actual printout is also filtered although it is possible
to not filter it there, I can't think of a reason why you would want that
though.

Changes to the two main functions to accept directory filter arguments and to
accept NULL/0 for some values. If you call `find_fd_for_pid` with NULL for the
output list it will just return a count of interesting fds.

TODO: Only works for absolute paths at the moment. Need to use realpath.
Use realpath. I am re-using the fullpath array because I like premature
optimization.

Also, realpath expands all symlinks. This may lead to a confusing user
experience. Except that the links under /proc/pid/fd/ all have symlinks
expanded to so it is really unavoidable.
So now you can just check the seex status of any large files:

    cv -gc \* -S $((1024*1024*1024))

TODO: Implement size suffixes like 1M, 1m, 1G etc.

This argument is probably of dubious worth, but it wasn't hard to add it
anyway.
This is a lot nice than raw bytes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants