An indexer and graph generator for binary build directories.
This tool can be used to quickly search where an ELF symbol matching some criteria is defined in a directory and generate graphs for various queries.
Features:
- Parallel, incremental indexing using sharded Xapian database
- Indexes cross-references by analyzing ELF relocations
- Query the database with a simple query language and custom output formats
- Generate symbol reference graphs in DOT format
With pip:
pip install .
Or, for development:
pip install -e .[dev]
For optional graph generation (this installs pygraphviz
):
pip install .[graphs]
xapian is required to be installed on the system.
You need Xapian Python bindings, you can get them:
-
By installing unofficial Xapian bindings Python package with:
pip install xapian-bindings
-
By running the provided install_xapian_bindings.sh script
-
By manually downloading and installing them from Xapian download page
To index a project that contains a compile_commands.json
file:
bdx index -c
Or you can specify the directory to index:
bdx index -d ./build
The indexer will only index files changed since last run.
The index
command also accepts -o
, --opt
option which can be used to set
some indexing settings, e.g. to disable indexing relocations:
bdx index -d ./build --opt index_relocations=False
Available options:
-
num_processes
- number of parallel indexing processes (default=same as # of CPUs). -
index_relocations
- if True (the default), all relocations will be applied and indexed. Setting this to False can significantly speed up indexing. -
min_symbol_size
- (default 1) only index symbols with size equal to or greater than this. -
use_dwarfdump
- if True (the default), usedwarfdump
program, if it's available, to find the source file for a compiled file, if it can't be found in any other way.
Generate an SVG image showing at most 20 routes from symbol main
in
main.o
to all symbols in section .text
in files matching wildcard
Algorithms_*
:
bdx graph 'main path:main.o' 'section:".text" AND path:Algorithms*' -n 20 | dot -Tsvg > graph.svg
By default this generates paths by using the ASTAR algorithm, the --algorithm BFS
or --algorithm DFS
options will use
breadth-first-search/depth-first-search algorithms which can generate different
graphs and can be slower/faster depending on the index and the queries
provided.
bdx search
command accepts a query string. A simple query language is
recognized.
$ bdx search -n 5 tree
tree-eh.o: _ZL20outside_finally_tree8treempleP6gimple
hooks.o: _Z14hook_void_treeP9tree_node
tree-eh.o: _ZL22record_in_finally_tree8treempleP4gtry
langhooks.o: _Z20lhd_return_null_treeP9tree_node
langhooks.o: _Z23lhd_tree_dump_dump_treePvP9tree_node
The -n
option sets the maximum number of symbols to search for.
The --demangle-names
option makes the demangled name available to use in the
output format string.
The -f
option can be used to set output format (json
, sexp
or Python string format spec):
$ bdx search -n 5 --demangle-names -f json tree
{"path": "/src/gcc-12/build/stage1-gcc/tree-eh.o", "name": "_ZL20outside_finally_tree8treempleP6gimple", "section": ".text", "address": 12255, "size": 104, "type": "FUNC", "relocations": ["", "_ZN10hash_tableI19finally_tree_hasherLb0E11xcallocatorE4findERKP17finally_tree_node"], "mtime": 1652372105820280262, "demangled": "outside_finally_tree(treemple, gimple*)"}
{"path": "/src/gcc-12/build/prev-gcc/hooks.o", "name": "_Z14hook_void_treeP9tree_node", "section": ".text", "address": 560, "size": 1, "type": "FUNC", "relocations": [], "mtime": 1652375092039025278, "demangled": "hook_void_tree(tree_node*)"}
{"path": "/src/gcc-12/build/gcc/tree-eh.o", "name": "_ZL22record_in_finally_tree8treempleP4gtry", "section": ".text", "address": 13440, "size": 415, "type": "FUNC", "relocations": ["", "_Z11fancy_abortPKciS0_", "_ZN10hash_tableI19finally_tree_hasherLb0E11xcallocatorE6expandEv", "prime_tab", "xmalloc"], "mtime": 1652377778150208461, "demangled": "record_in_finally_tree(treemple, gtry*)"}
{"path": "/src/gcc-12/build/stage1-gcc/langhooks.o", "name": "_Z20lhd_return_null_treeP9tree_node", "section": ".text", "address": 278, "size": 15, "type": "FUNC", "relocations": [], "mtime": 1652372076295950259, "demangled": "lhd_return_null_tree(tree_node*)"}
{"path": "/src/gcc-12/build/stage1-gcc/langhooks.o", "name": "_Z23lhd_tree_dump_dump_treePvP9tree_node", "section": ".text", "address": 1692, "size": 19, "type": "FUNC", "relocations": [], "mtime": 1652372076295950259, "demangled": "lhd_tree_dump_dump_tree(void*, tree_node*)"}
$ bdx search -n 5 --demangle-names -f '0x{address:0>10x}|{section:<10}|{type:8}|{demangled}' tree
0x0000002fdf|.text |FUNC |outside_finally_tree(treemple, gimple*)
0x0000000230|.text |FUNC |hook_void_tree(tree_node*)
0x0000003480|.text |FUNC |record_in_finally_tree(treemple, gtry*)
0x0000000116|.text |FUNC |lhd_return_null_tree(tree_node*)
0x000000069c|.text |FUNC |lhd_tree_dump_dump_tree(void*, tree_node*)
-
Search for symbols having
foo
ANDbar
somewhere in their name:bdx search foo AND bar
or:
bdx search foo bar
-
Search for symbols having either
foo
orbar
in their name:bdx search foo OR bar
-
Search for symbols named exactly
foo
:bdx search fullname:foo
-
Search for symbols where Elf ST_INFO type is
STT_FUNC
orSTT_OBJECT
:bdx search type:FUNC OR type:OBJECT
-
Search for symbols
foo*
in binary files namedbar.o
:bdx search 'name:foo*' path:bar.o
-
Search for symbols in files compiled from source file named
file.c
:bdx search source:file.c
-
Search for symbols
foo
orbar
that are not mangled (_Z*
prefix):bdx search '(foo OR bar)' AND NOT name:_Z*
-
Search for symbols that reference/call
memset
:bdx search relocations:memset
-
Search for symbols that call
malloc
, but notfree
:bdx search relocations:malloc NOT relocations:free
-
Search for symbols with size in some range, where address is at least 0xfff0:
bdx search foo size:100..200 address:0xfff0..
-
Search for symbols by relative path of the binary:
bdx search 'path:./build/module/*'
-
Search for string literals:
bdx search 'path:"/path/to/File With Spaces.o"'
-
Search for big symbols in some section:
bdx search section:.rodata AND size:1000..
Copyright (C) 2024 Michał Krzywkowski
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.