Add encode-data-using-add-xor-sub-operations.yml #800

jtothej · 2023-07-15T10:04:37Z

Add encode-data-using-add-xor-sub-operations.yml
The rule feels a bit brute force-y but I'm not sure if there's more elegant way to do it.

williballenthin

interesting idea. i'm also not sure if this is the right way to express the rule. can you imagine a different syntax or other feature that would let you express this more easily? could we introduce another characteristic that would enable this?

williballenthin · 2023-07-15T10:47:25Z

data-manipulation/encoding/encode-data-using-add-xor-sub-operations.yml

+    - and:
+      - characteristic: tight loop
+      - instruction:
+        - mnemonic: xor


this could be replaced with characteristic(nzxor) (though technically that also matches xor reg,reg too).

With this rule I wanted to express the following logic: require ADD, SUB and XOR instructions to be present in a single basic block and require immediate value operand for ADD, SUB and XOR instructions to be within (0x1,0xff) range.

characteristic: nzxor could work here but I chose to be as specific as possible - code I'm trying to match typically looks like this:

add dl, 4Fh xor dl, 0F1h sub dl, 4Fh

One thing that I think could make the rule a bit more cleaner would be support for ranges or list of values by the number feature, so e.g.

- and: - characteristic: tight loop - instruction: - mnemonic: xor - operand[1].number: (0x1,0xff) - instruction: - mnemonic: add - operand[1].number: (0x1,0xff) - instruction: - mnemonic: sub - operand[1].number: (0x1,0xff)

would this work as an approximation?

- not: - operand[1].number: 0

this would include larger values though, so may be too broad?

All code samples I've seen use single byte values as ADD/XOR/SUB operands so any non-zero value might be too broad.

hm, ok, I'm not sure how to best proceed here then without changing the rule syntax (not worth it ATM IMHO) or just merging as is

@williballenthin are we worried about performance here? @jtothej could we add other features to quickly filter out likely FPs (large functions, etc.)?

@mr-tz i agree that its probably not worth changing the rule syntax for a single rule today. if we can come up with other places the syntax change is useful then lets reconsider.

@mr-tz i am somewhat worried about performance. i think this can be assessed by running the test cases before/after this PR and seeing if there's any meaningful delta. i wouldn't be surprised if my intuition is wrong about performance, so this shouldn't be a blocker without real data. sorry @jtothej for bogging down the PR merge with this worry.

@jtothej i think we should consider removing some common, small values from the ADD/SUB branches to account for iteration over an array, like bytewise processing of a string, wstring, etc. So maybe remove 1/2/4/8/16 from ADD/SUB branches. thoughts?

williballenthin

with a review of the test case matches (assume it works) and other matches in our test corpus (TODO), then a sanity check on performance impact, then im ok with this being merged.

williballenthin · 2023-10-16T11:28:41Z

data-manipulation/encoding/encode-data-using-add-xor-sub-operations.yml

+          - operand[1].number: 0x01
+          - operand[1].number: 0x02


Suggested change

- operand[1].number: 0x01

- operand[1].number: 0x02

common string/wstring index deltas, remove to avoid FPs?

note this is based on intuition, so feel free to push back/discuss.

williballenthin · 2023-10-16T11:28:54Z

data-manipulation/encoding/encode-data-using-add-xor-sub-operations.yml

+          - operand[1].number: 0x01
+          - operand[1].number: 0x02
+          - operand[1].number: 0x03
+          - operand[1].number: 0x04


Suggested change

- operand[1].number: 0x04

common array index deltas, remove to avoid FPs?

note this is based on intuition, so feel free to push back/discuss.

williballenthin · 2023-10-16T11:29:01Z

data-manipulation/encoding/encode-data-using-add-xor-sub-operations.yml

+          - operand[1].number: 0x05
+          - operand[1].number: 0x06
+          - operand[1].number: 0x07
+          - operand[1].number: 0x08


Suggested change

- operand[1].number: 0x08

common array index deltas, remove to avoid FPs?

note this is based on intuition, so feel free to push back/discuss.

williballenthin · 2023-10-16T11:29:19Z

data-manipulation/encoding/encode-data-using-add-xor-sub-operations.yml

+          - operand[1].number: 0x01
+          - operand[1].number: 0x02


Suggested change

- operand[1].number: 0x01

- operand[1].number: 0x02

common string/wstring index deltas, remove to avoid FPs?

note this is based on intuition, so feel free to push back/discuss.

mr-tz · 2023-11-20T10:15:48Z

according to offline discussions, Willi's suggestions above or this approximation both work also

  features:
    - and:
      - count(basic blocks): 6 or fewer
      - basic block:
        - and:
          - characteristic: tight loop
          - characteristic: nzxor
          - count(mnemonic(add)): 1
          - count(mnemonic(sub)): 1

data-manipulation/encoding/encode-data-using-add-xor-sub-operations.yml

…tions.yml Co-authored-by: Moritz <[email protected]>

data-manipulation/encoding/encode-data-using-add-xor-sub-operations.yml

…tions.yml

mr-tz · 2023-11-22T13:10:18Z

Thanks a lot!

Add encode-data-using-add-xor-sub-operations.yml

bd520e7

jtothej mentioned this pull request Jul 15, 2023

Add test file for encode-data-using-add-xor-sub-operations.yml mandiant/capa-testfiles#201

Merged

williballenthin reviewed Jul 15, 2023

View reviewed changes

williballenthin approved these changes Oct 16, 2023

View reviewed changes

williballenthin reviewed Oct 16, 2023

View reviewed changes

mr-tz reviewed Nov 22, 2023

View reviewed changes

data-manipulation/encoding/encode-data-using-add-xor-sub-operations.yml Outdated Show resolved Hide resolved

Update data-manipulation/encoding/encode-data-using-add-xor-sub-opera…

f2a2f91

…tions.yml Co-authored-by: Moritz <[email protected]>

mr-tz reviewed Nov 22, 2023

View reviewed changes

data-manipulation/encoding/encode-data-using-add-xor-sub-operations.yml Outdated Show resolved Hide resolved

Update data-manipulation/encoding/encode-data-using-add-xor-sub-opera…

3a235ba

…tions.yml

mr-tz merged commit 6ad4499 into mandiant:master Nov 22, 2023
3 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add encode-data-using-add-xor-sub-operations.yml #800

Add encode-data-using-add-xor-sub-operations.yml #800

jtothej commented Jul 15, 2023

williballenthin left a comment

williballenthin Jul 15, 2023

jtothej Jul 17, 2023

mr-tz Jul 19, 2023

jtothej Jul 19, 2023

mr-tz Jul 20, 2023

mr-tz Oct 9, 2023

williballenthin Oct 16, 2023 •

edited

Loading

williballenthin left a comment

williballenthin Oct 16, 2023

williballenthin Oct 16, 2023

williballenthin Oct 16, 2023

williballenthin Oct 16, 2023

mr-tz commented Nov 20, 2023

mr-tz commented Nov 22, 2023

Add encode-data-using-add-xor-sub-operations.yml #800

Add encode-data-using-add-xor-sub-operations.yml #800

Conversation

jtothej commented Jul 15, 2023

williballenthin left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

williballenthin Oct 16, 2023 • edited Loading

Choose a reason for hiding this comment

williballenthin left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mr-tz commented Nov 20, 2023

mr-tz commented Nov 22, 2023

williballenthin Oct 16, 2023 •

edited

Loading