Merge branch 'master' into fix-1734
colton-gabertan committed Jan 30, 2024
2 parents 70e462a + 4377321 commit 45a11e4
Showing 29 changed files with 1,355 additions and 1,156 deletions.
45 changes: 42 additions & 3 deletions CHANGELOG.md
@@ -3,7 +3,34 @@
## master (unreleased)

### New Features

- add Ghidra UI integration #1734 @colton-gabertan @mike-hunhoff

### Breaking Changes

- main: introduce wrapping routines within main for working with CLI args #1813 @williballenthin
- move functions from `capa.main` to new `capa.loader` namespace #1821 @williballenthin

### New Rules (0)

-

### Bug Fixes

### capa explorer IDA Pro plugin

### Development

### Raw diffs
- [capa v7.0.0-beta...master](https://github.com/mandiant/capa/compare/v7.0.0-beta...master)
- [capa-rules v7.0.0-beta...master](https://github.com/mandiant/capa-rules/compare/v7.0.0-beta...master)

## v7.0.0-beta
This is the beta release of capa v7.0 which was mainly worked on during the Google Summer of Code (GSoC) 2023. A huge
shoutout to @colton-gabertan and @yelhamer for their amazing work.

Also a big thanks to the other contributors: @aaronatp, @Aayush-Goel-04, @bkojusner, @doomedraven, @ruppde, and @xusheng6.
### New Features
- add Ghidra backend #1770 #1767 @colton-gabertan @mike-hunhoff
- add dynamic analysis via CAPE sandbox reports #48 #1535 @yelhamer
- add call scope #771 @yelhamer
@@ -66,24 +93,36 @@
- nursery/hook-routines-via-dlsym-rtld_next [email protected]
- nursery/linked-against-hp-socket [email protected]
- host-interaction/process/inject/process-ghostly-hollowing [email protected]
-

### Bug Fixes
- ghidra: fix `ints_to_bytes` performance #1761 @mike-hunhoff
- binja: improve function call site detection @xusheng6
- binja: use `binaryninja.load` to open files @xusheng6
- binja: bump binja version to 3.5 #1789 @xusheng6
- elf: better detect ELF OS via GCC .ident directives #1928 @williballenthin
- elf: better detect ELF OS via Android dependencies #1947 @williballenthin
- fix setuptools package discovery #1886 @gmacon @mr-tz

### capa explorer IDA Pro plugin

### Development
- update ATT&CK/MBC data for linting #1932 @mr-tz

#### Developer Notes
With this new release, many classes and concepts have been split up into static (mostly identical to the
prior implementations) and dynamic ones. For example, the legacy FeatureExtractor class has been renamed to
StaticFeatureExtractor and the DynamicFeatureExtractor has been added.

Starting from version 7.0, we have moved the component responsible for feature extraction from `main` to a new
`capabilities` module. Users wishing to utilize capa’s feature extraction abilities should now use that module instead
of importing the relevant logic from the main file.

For sandbox-based feature extractors, we are using Pydantic models. Contributions of more models for other sandboxes
are very welcome!

### Raw diffs
- [capa v6.1.0...master](https://github.com/mandiant/capa/compare/v6.1.0...master)
- [capa-rules v6.1.0...master](https://github.com/mandiant/capa-rules/compare/v6.1.0...master)
- [capa v6.1.0...v7.0.0-beta](https://github.com/mandiant/capa/compare/v6.1.0...v7.0.0-beta)
- [capa-rules v6.1.0...v7.0.0-beta](https://github.com/mandiant/capa-rules/compare/v6.1.0...v7.0.0-beta)

## v6.1.0

8 changes: 6 additions & 2 deletions capa/features/common.py
@@ -458,18 +458,22 @@ def evaluate(self, ctx, **kwargs):
FORMAT_SC32 = "sc32"
FORMAT_SC64 = "sc64"
FORMAT_CAPE = "cape"
FORMAT_FREEZE = "freeze"
FORMAT_RESULT = "result"
STATIC_FORMATS = {
FORMAT_SC32,
FORMAT_SC64,
FORMAT_PE,
FORMAT_ELF,
FORMAT_DOTNET,
FORMAT_FREEZE,
FORMAT_RESULT,
}
DYNAMIC_FORMATS = {
FORMAT_CAPE,
FORMAT_FREEZE,
FORMAT_RESULT,
}
FORMAT_FREEZE = "freeze"
FORMAT_RESULT = "result"
FORMAT_UNKNOWN = "unknown"
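
As a reading aid for this hunk: `FORMAT_FREEZE` and `FORMAT_RESULT` now appear in both `STATIC_FORMATS` and `DYNAMIC_FORMATS`, so they have to be defined before the two sets are built. The standalone sketch below mirrors that layout. The values for `FORMAT_PE`, `FORMAT_ELF`, and `FORMAT_DOTNET` are assumptions (they are not visible in the hunk), and `flavors` is a hypothetical helper, not part of capa.

```python
from typing import Set

# values shown in the hunk
FORMAT_SC32 = "sc32"
FORMAT_SC64 = "sc64"
FORMAT_CAPE = "cape"
FORMAT_FREEZE = "freeze"
FORMAT_RESULT = "result"
FORMAT_UNKNOWN = "unknown"

# assumed values: these constants are referenced but not defined in the hunk
FORMAT_PE = "pe"
FORMAT_ELF = "elf"
FORMAT_DOTNET = "dotnet"

STATIC_FORMATS = {
    FORMAT_SC32,
    FORMAT_SC64,
    FORMAT_PE,
    FORMAT_ELF,
    FORMAT_DOTNET,
    FORMAT_FREEZE,
    FORMAT_RESULT,
}
DYNAMIC_FORMATS = {
    FORMAT_CAPE,
    FORMAT_FREEZE,
    FORMAT_RESULT,
}


def flavors(format_: str) -> Set[str]:
    """hypothetical helper: which analysis flavors may a format carry?"""
    found = set()
    if format_ in STATIC_FORMATS:
        found.add("static")
    if format_ in DYNAMIC_FORMATS:
        found.add("dynamic")
    return found
```

Under these assumptions, a freeze or result file can hold either flavor, while a CAPE report is dynamic only.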


4 changes: 2 additions & 2 deletions capa/features/extractors/common.py
@@ -45,7 +45,7 @@
MATCH_JSON_OBJECT = b'{"'


def extract_file_strings(buf, **kwargs) -> Iterator[Tuple[String, Address]]:
def extract_file_strings(buf: bytes, **kwargs) -> Iterator[Tuple[String, Address]]:
"""
extract ASCII and UTF-16 LE strings from file
"""
@@ -56,7 +56,7 @@ def extract_file_strings(buf, **kwargs) -> Iterator[Tuple[String, Address]]:
yield String(s.s), FileOffsetAddress(s.offset)


def extract_format(buf) -> Iterator[Tuple[Feature, Address]]:
def extract_format(buf: bytes) -> Iterator[Tuple[Feature, Address]]:
if buf.startswith(MATCH_PE):
yield Format(FORMAT_PE), NO_ADDRESS
elif buf.startswith(MATCH_ELF):
13 changes: 9 additions & 4 deletions capa/features/extractors/elf.py
@@ -866,6 +866,8 @@ def guess_os_from_ident_directive(elf: ELF) -> Optional[OS]:
return OS.LINUX
elif "Red Hat" in comment:
return OS.LINUX
elif "Android" in comment:
return OS.ANDROID

return None

@@ -921,6 +923,8 @@ def guess_os_from_needed_dependencies(elf: ELF) -> Optional[OS]:
return OS.HURD
if needed.startswith("libandroid.so"):
return OS.ANDROID
if needed.startswith("liblog.so"):
return OS.ANDROID

return None

@@ -1023,10 +1027,6 @@ def detect_elf_os(f) -> str:
if osabi_guess:
ret = osabi_guess

elif ident_guess:
# we don't trust this too much due to non-cross-compilation assumptions
ret = ident_guess

elif ph_notes_guess:
ret = ph_notes_guess

@@ -1045,6 +1045,11 @@ def detect_elf_os(f) -> str:
elif symtab_guess:
ret = symtab_guess

elif ident_guess:
# at the bottom because we don't trust this too much
# due to potential for bugs with cross-compilation.
ret = ident_guess

return ret.value if ret is not None else "unknown"
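
The effect of moving the `ident_guess` branch to the end of the chain can be illustrated with a small priority-ordered lookup. This is only a sketch of the idea, not capa's code: `pick_os` is a hypothetical helper, the guesses are plain strings rather than `OS` enum members, and only a few of the sources in `detect_elf_os` are represented because the middle of the chain is elided in the diff.

```python
from typing import List, Optional, Tuple


def pick_os(guesses: List[Tuple[str, Optional[str]]]) -> str:
    """
    return the first non-empty guess, in list order.
    earlier entries are trusted more than later ones.
    """
    for _source, guess in guesses:
        if guess:
            return guess
    return "unknown"


# a binary cross-compiled for Android with a Linux-hosted toolchain:
# the .ident comment hints at the build host, the needed libraries at the target.
guesses = [
    ("osabi", None),
    ("needed_dependencies", "android"),
    ("symtab", None),
    ("ident", "linux"),  # consulted last, as in the reordered chain
]
result = pick_os(guesses)
```

With the `.ident` guess consulted last, a more reliable signal such as the needed-dependencies guess wins whenever one is available, and the `.ident` comment only decides when nothing else matched.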


15 changes: 10 additions & 5 deletions capa/features/freeze/__init__.py
@@ -21,6 +21,7 @@
# https://github.com/mandiant/capa/issues/1699
from typing_extensions import TypeAlias

import capa.loader
import capa.helpers
import capa.version
import capa.features.file
@@ -681,14 +682,18 @@ def main(argv=None):
argv = sys.argv[1:]

parser = argparse.ArgumentParser(description="save capa features to a file")
capa.main.install_common_args(parser, {"sample", "format", "backend", "os", "signatures"})
capa.main.install_common_args(parser, {"input_file", "format", "backend", "os", "signatures"})
parser.add_argument("output", type=str, help="Path to output file")
args = parser.parse_args(args=argv)
capa.main.handle_common_args(args)

sigpaths = capa.main.get_signatures(args.signatures)

extractor = capa.main.get_extractor(args.sample, args.format, args.os, args.backend, sigpaths, False)
try:
capa.main.handle_common_args(args)
capa.main.ensure_input_exists_from_cli(args)
input_format = capa.main.get_input_format_from_cli(args)
backend = capa.main.get_backend_from_cli(args, input_format)
extractor = capa.main.get_extractor_from_cli(args, input_format, backend)
except capa.main.ShouldExitError as e:
return e.status_code

Path(args.output).write_bytes(dump(extractor))
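
The new `try`/`except capa.main.ShouldExitError` block follows a common CLI pattern: helper routines raise an exception carrying an exit status, and `main` converts it into a return value. The sketch below shows that pattern in isolation. Everything in it is a stand-in: the local `ShouldExitError` class only borrows the name and the `status_code` attribute seen in the diff, `ensure_input_exists` is a hypothetical helper, and `E_MISSING_INPUT` is an arbitrary placeholder value.

```python
import argparse
from pathlib import Path
from typing import List, Optional

E_MISSING_INPUT = 2  # arbitrary placeholder status code for this sketch


class ShouldExitError(Exception):
    """stand-in: raised by a helper when the program should exit with a status code."""

    def __init__(self, status_code: int):
        super().__init__(status_code)
        self.status_code = status_code


def ensure_input_exists(args: argparse.Namespace) -> None:
    """hypothetical helper: validate the input path, or ask main to exit."""
    if not Path(args.input_file).exists():
        raise ShouldExitError(E_MISSING_INPUT)


def main(argv: Optional[List[str]] = None) -> int:
    parser = argparse.ArgumentParser(description="exit-status pattern demo")
    parser.add_argument("input_file", type=str, help="path to the input file")
    args = parser.parse_args(args=argv)

    try:
        ensure_input_exists(args)
    except ShouldExitError as e:
        # the helper decided we cannot continue; report its status code
        return e.status_code

    return 0
```

The appeal of this design is that each helper can fail on its own terms while every entry point that reuses the helpers handles failure in one place.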

8 changes: 4 additions & 4 deletions capa/ghidra/capa_ghidra.py
@@ -69,7 +69,7 @@ def run_headless():
rules_path = pathlib.Path(args.rules)

logger.debug("rule path: %s", rules_path)
rules = capa.main.get_rules([rules_path])
rules = capa.rules.get_rules([rules_path])

meta = capa.ghidra.helpers.collect_metadata([rules_path])
extractor = capa.features.extractors.ghidra.extractor.GhidraFeatureExtractor()
@@ -78,7 +78,7 @@ def run_headless():

meta.analysis.feature_counts = counts["feature_counts"]
meta.analysis.library_functions = counts["library_functions"]
meta.analysis.layout = capa.main.compute_layout(rules, extractor, capabilities)
meta.analysis.layout = capa.loader.compute_layout(rules, extractor, capabilities)

if capa.capabilities.common.has_file_limitation(rules, capabilities, is_standalone=True):
logger.info("capa encountered warnings during analysis")
@@ -119,7 +119,7 @@ def run_ui():
rules_path: pathlib.Path = pathlib.Path(rules_dir)
logger.info("running capa using rules from %s", str(rules_path))

rules = capa.main.get_rules([rules_path])
rules = capa.rules.get_rules([rules_path])

meta = capa.ghidra.helpers.collect_metadata([rules_path])
extractor = capa.features.extractors.ghidra.extractor.GhidraFeatureExtractor()
@@ -128,7 +128,7 @@ def run_ui():

meta.analysis.feature_counts = counts["feature_counts"]
meta.analysis.library_functions = counts["library_functions"]
meta.analysis.layout = capa.main.compute_layout(rules, extractor, capabilities)
meta.analysis.layout = capa.loader.compute_layout(rules, extractor, capabilities)

if capa.capabilities.common.has_file_limitation(rules, capabilities, is_standalone=False):
logger.info("capa encountered warnings during analysis")
28 changes: 27 additions & 1 deletion capa/helpers.py
@@ -5,6 +5,7 @@
# Unless required by applicable law or agreed to in writing, software distributed under the License
# is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and limitations under the License.
import sys
import json
import inspect
import logging
@@ -16,12 +17,22 @@
import tqdm

from capa.exceptions import UnsupportedFormatError
from capa.features.common import FORMAT_PE, FORMAT_CAPE, FORMAT_SC32, FORMAT_SC64, FORMAT_DOTNET, FORMAT_UNKNOWN, Format
from capa.features.common import (
FORMAT_PE,
FORMAT_CAPE,
FORMAT_SC32,
FORMAT_SC64,
FORMAT_DOTNET,
FORMAT_FREEZE,
FORMAT_UNKNOWN,
Format,
)

EXTENSIONS_SHELLCODE_32 = ("sc32", "raw32")
EXTENSIONS_SHELLCODE_64 = ("sc64", "raw64")
EXTENSIONS_DYNAMIC = ("json", "json_")
EXTENSIONS_ELF = "elf_"
EXTENSIONS_FREEZE = "frz"

logger = logging.getLogger("capa")

@@ -81,6 +92,8 @@ def get_format_from_extension(sample: Path) -> str:
format_ = FORMAT_SC64
elif sample.name.endswith(EXTENSIONS_DYNAMIC):
format_ = get_format_from_report(sample)
elif sample.name.endswith(EXTENSIONS_FREEZE):
format_ = FORMAT_FREEZE
return format_
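
For orientation, the extension check that this hunk extends can be approximated as below. This is a simplified sketch rather than the function from `capa/helpers.py`: `format_from_extension` is a hypothetical name, the branch for dynamic reports (which calls `get_format_from_report`) is left out because that function is not shown in the diff, and the initial `FORMAT_UNKNOWN` default is an assumption about the lines the diff elides.

```python
from pathlib import Path

FORMAT_SC32 = "sc32"
FORMAT_SC64 = "sc64"
FORMAT_FREEZE = "freeze"
FORMAT_UNKNOWN = "unknown"

EXTENSIONS_SHELLCODE_32 = ("sc32", "raw32")
EXTENSIONS_SHELLCODE_64 = ("sc64", "raw64")
EXTENSIONS_FREEZE = "frz"


def format_from_extension(sample: Path) -> str:
    """guess the input format from the file name suffix alone."""
    format_ = FORMAT_UNKNOWN  # assumed default
    # str.endswith accepts either a single string or a tuple of strings
    if sample.name.endswith(EXTENSIONS_SHELLCODE_32):
        format_ = FORMAT_SC32
    elif sample.name.endswith(EXTENSIONS_SHELLCODE_64):
        format_ = FORMAT_SC64
    elif sample.name.endswith(EXTENSIONS_FREEZE):
        format_ = FORMAT_FREEZE
    return format_
```

Because the test is a plain suffix match on the file name, any name ending in `frz` is treated as a freeze file, with or without a preceding dot.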


@@ -201,3 +214,16 @@ def log_unsupported_runtime_error():
" If you're seeing this message on the command line, please ensure you're running a supported Python version."
)
logger.error("-" * 80)


def is_running_standalone() -> bool:
"""
are we running from a PyInstaller'd executable?
if so, then we'll be able to access `sys._MEIPASS` for the packaged resources.
"""
# typically we only expect capa.main to be packaged via PyInstaller.
# therefore, this *should* be in capa.main; however,
# the Binary Ninja extractor uses this to resolve the BN API code,
# so we keep this in a common area.
# generally, other library code should not use this function.
return hasattr(sys, "frozen") and hasattr(sys, "_MEIPASS")
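
The docstring mentions that `sys._MEIPASS` gives access to packaged resources. A minimal sketch of how a caller might use the check follows. `is_running_standalone` is copied from the hunk; `resource_root` and its `default` parameter are hypothetical and only illustrate the idea.

```python
import sys
from pathlib import Path


def is_running_standalone() -> bool:
    """are we running from a PyInstaller'd executable?"""
    return hasattr(sys, "frozen") and hasattr(sys, "_MEIPASS")


def resource_root(default: Path) -> Path:
    """
    hypothetical helper: pick the directory that holds bundled resources.
    inside a PyInstaller bundle, files are unpacked under sys._MEIPASS;
    otherwise fall back to a directory supplied by the caller.
    """
    if is_running_standalone():
        return Path(getattr(sys, "_MEIPASS"))
    return default
```

When run from a regular Python interpreter, neither attribute is set, so the fallback directory is returned.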
19 changes: 12 additions & 7 deletions capa/ida/plugin/form.py
@@ -636,7 +636,7 @@ def on_load_rule(_, i, total):
if ida_kernwin.user_cancelled():
raise UserCancelledError("user cancelled")

return capa.main.get_rules([rule_path], on_load_rule=on_load_rule)
return capa.rules.get_rules([rule_path], on_load_rule=on_load_rule)
except UserCancelledError:
logger.info("User cancelled analysis.")
return None
@@ -775,7 +775,7 @@ def slot_progress_feature_extraction(text):

meta.analysis.feature_counts = counts["feature_counts"]
meta.analysis.library_functions = counts["library_functions"]
meta.analysis.layout = capa.main.compute_layout(ruleset, self.feature_extractor, capabilities)
meta.analysis.layout = capa.loader.compute_layout(ruleset, self.feature_extractor, capabilities)
except UserCancelledError:
logger.info("User cancelled analysis.")
return False
@@ -1073,9 +1073,7 @@ def load_capa_function_results(self):

self.view_rulegen_features.load_features(all_file_features, all_function_features)

self.set_view_status_label(
f"capa rules: {settings.user[CAPA_SETTINGS_RULE_PATH]} ({settings.user[CAPA_SETTINGS_RULE_PATH]} rules)"
)
self.set_view_status_label(f"capa rules: {settings.user[CAPA_SETTINGS_RULE_PATH]}")
except Exception as e:
logger.exception("Failed to render views (error: %s)", e)
return False
@@ -1324,10 +1322,17 @@ def save_function_analysis(self):
idaapi.info("No rule to save.")
return

path = Path(self.ask_user_capa_rule_file())
if not path.exists():
rule_file_path = self.ask_user_capa_rule_file()
if not rule_file_path:
# dialog canceled
return

path = Path(rule_file_path)
if not path.parent.exists():
logger.warning("Failed to save file: parent directory '%s' does not exist.", path.parent)
return

logger.info("Saving rule to %s.", path)
write_file(path, s)
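
The checks in this hunk (an empty dialog result, and a parent directory that must exist) can be exercised outside IDA with a small sketch. `resolve_save_path` is a hypothetical helper written for illustration; the plugin performs these checks inline and logs a warning instead of silently returning.

```python
from pathlib import Path
from typing import Optional


def resolve_save_path(user_choice: str) -> Optional[Path]:
    """
    hypothetical helper: turn a save-dialog result into a writable path, or None.
    """
    if not user_choice:
        # dialog canceled: nothing was selected
        return None

    path = Path(user_choice)
    # a file that is about to be created does not exist yet,
    # so only its parent directory is required to exist
    if not path.parent.exists():
        return None

    return path
```

Testing `path.exists()` on the target itself would reject every new file, which is why the sketch only requires the parent directory to be present.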

def slot_checkbox_limit_by_changed(self, state):