Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add flavored scopes #1580

Merged
merged 40 commits into from
Jul 12, 2023
Merged

Conversation

yelhamer
Copy link
Collaborator

@yelhamer yelhamer commented Jul 1, 2023

This PR proposes an implementation for the concept of flavors (as discussed in #1517 ).

In this PR, each rule is assumed to have two different interpretations depending on what type of extractor it is matching against (Dynamic or Static). If the extractor is static, it will be looking to match against static scopes and will thus fetch the static interpretation of the rule form the RuleSet class, and if it is a dynamic extractor, it will do the same by fetching the dynamic interpretation. A rule may lack one interpretation/flavor at most (either a static scope or a dynamic one must be specified).

One advantage of this design is we can have rules written for one analysis mode (static for example) be used in a dynamic analysis context. For example the following rule:

rule:
  meta:
    name: create a socket
    scope:
      static: function
      dynamic: thread
  features:
    - or:
      - api: socket
      - and:
        - instruction:
          - mnemonic: syscall
          - number: 41

In this case, if the rule is being matched against features extracted by a dynamic extractor then the API(socket) feature would still match, and thereby the whole rule would still match and it won't be lost.

Disadvantages of this approach is the the Flavor class feels like a hack. I feel like it could be nice if we could rename that class into Scope, and find another way to represent the Scope enum class.

Checklist

  • No CHANGELOG update needed
  • No new tests needed
  • No documentation update needed

@yelhamer yelhamer added breaking-change introduces a breaking change that should be released in a major version gsoc Work related to Google Summer of Code project. dynamic related to dynamic analysis flavor labels Jul 1, 2023
@yelhamer
Copy link
Collaborator Author

yelhamer commented Jul 2, 2023

Currently, the scope member of each Rule object is a Flavor object, and I think that class name (Flavor) is pretty confusing and should be renamed. Ideally, we would rename that class to Scope, but that's already taken. So I suggest that we either rename the current Scope enum class (maybe to ScopeEnum ?), or we rename Flavor to RuleScope. What are your thoughts on that? and do you have any other suggestions?

@ghost
Copy link

ghost commented Jul 3, 2023

first: the description text for this PR is really good! i appreciate the summary of the background, the example of the rule, and what this means for users. really nice job.

Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice start to the changes to the rule syntax. i've made some comments inline, though im not sure whether or not they'll be easy to understand. so, i'll try to summarize them here, and don't hesitate to reach out with more questions.

  1. only support the case when both static & dynamic scopes are provided. make any other case an error (for now). if we want to add special handling, lets do that later when we know the easy case works well.

  2. update rule syntax to have meta.scopes instead of meta.scope. the type of this thing should be Scopes, which replaces your Flavor class (but looks approximately the same). this way we don't have to use the term "flavor" very much/at all :-). because each rule should have multiple scope entries, it makes sense for the field name to be plural. then, when we say "scope" we mean a single scope, like function, and when we say "scopes" we mean the mapping from flavor to scope, like {static: function, dynamic: thread}.

  3. try to work with single scopes (Scope, singular) wherever possible. if a function accepts a Scopes (plural) argument, then its return value should be a mapping (like a Dict) from flavor to some result. for example, if you want a function ensure_feature_valid_for_scopes (Scopes, plural), then the result should indicate if the given feature is valid for each provided scope. but i think this should be uncommon, and you should prefer the singular when possible. this is because our matching engine won't be mixing rule logic across flavors, so i think its unlikely that we'll want to mix any data structure or logic across flavors.

  4. therefore, if we need to build different rule logic trees, one for each flavor, that's fine. this might take a bit of work - feel free to propose a design or start a discussion if you have questions.

the tests look great, so if those can continue passing while we address the above, we'll know things will be ready to merge.

tests/test_rules.py Show resolved Hide resolved
capa/rules/__init__.py Outdated Show resolved Hide resolved
capa/rules/__init__.py Outdated Show resolved Hide resolved
capa/rules/__init__.py Outdated Show resolved Hide resolved
capa/rules/__init__.py Outdated Show resolved Hide resolved
capa/rules/__init__.py Outdated Show resolved Hide resolved
capa/rules/__init__.py Outdated Show resolved Hide resolved
capa/rules/__init__.py Outdated Show resolved Hide resolved
@yelhamer yelhamer marked this pull request as draft July 5, 2023 10:45
@yelhamer
Copy link
Collaborator Author

yelhamer commented Jul 7, 2023

I have xfailed all the tests that rely on old rules (has scope field and can't be translated yet), and build_statement() now makes sure that the statement is valid for both scopes.

@williballenthin
Copy link
Collaborator

@yelhamer can you work on the merge conflicts?

@yelhamer
Copy link
Collaborator Author

sure, will do.

@williballenthin
Copy link
Collaborator

williballenthin commented Jul 11, 2023

image

for these, update here:

capa/.github/ruff.toml

Lines 51 to 61 in c1cd272

"tests/test_main.py" = ["F401", "F811"]
"tests/test_proto.py" = ["F401", "F811"]
"tests/test_freeze.py" = ["F401", "F811"]
"tests/test_function_id.py" = ["F401", "F811"]
"tests/test_viv_features.py" = ["F401", "F811"]
"tests/test_binja_features.py" = ["F401", "F811"]
"tests/test_pefile_features.py" = ["F401", "F811"]
"tests/test_dnfile_features.py" = ["F401", "F811"]
"tests/test_dotnet_features.py" = ["F401", "F811"]
"tests/test_result_document.py" = ["F401", "F811"]
"tests/test_dotnetfile_features.py" = ["F401", "F811"]

@yelhamer yelhamer requested a review from williballenthin July 11, 2023 14:26
Copy link
Collaborator

@williballenthin williballenthin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@williballenthin
Copy link
Collaborator

@yelhamer please resolve conflicts and then i'll merge

@yelhamer yelhamer requested a review from mr-tz July 12, 2023 11:46
@mr-tz
Copy link
Collaborator

mr-tz commented Jul 12, 2023

looks like there's still conflicts, happy to merge after those are resolved

@mr-tz
Copy link
Collaborator

mr-tz commented Jul 12, 2023

capa/ida/plugin/form.py:1195:58: C419 Unnecessary list comprehension passed to any() prevents short-circuiting - rewrite as a generator.

sorry if these are unrelated issues to this PR :(

@yelhamer
Copy link
Collaborator Author

ah no it's my fault. I switched mid-way through from using the '==' operator to using the 'in' one, but forgot to update the ida plugin again with that.

@yelhamer
Copy link
Collaborator Author

@mr-tz checks passed, and should be ready to merge!

@mr-tz mr-tz merged commit ce15a2b into mandiant:dynamic-feature-extraction Jul 12, 2023
@mr-tz
Copy link
Collaborator

mr-tz commented Jul 12, 2023

great work!

@williballenthin
Copy link
Collaborator

🥳

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking-change introduces a breaking change that should be released in a major version dynamic related to dynamic analysis flavor gsoc Work related to Google Summer of Code project.
Projects
Status: done
Development

Successfully merging this pull request may close these issues.

update rule syntax to support scope specification for dynamic analysis flavor
3 participants