failure on elf32-littlearm #1133
ivg added a commit to ivg/bap that referenced this issue on Jun 16, 2020
TL;DR: jumps that cross segment and section boundaries are now treated more thoroughly: if a jump instruction leads to an invalid execution chain in some other segment, then we cancel both the chain in the other segment and the chain that led to that jump in the current segment (previously it was canceled only up to the boundaries of its own segment). Partially fixes BinaryAnalysisPlatform#1133; however, since it is a Thumb 2.0 binary, for BAP it is still mostly random data rather than something meaningful.

Problem
-------

Since 2.0 we have the incremental disassembler that supports cross-section/cross-segment jumps. As BinaryAnalysisPlatform#1133 shows, sometimes they can go wrong, as they were treated specially and had some preferences that regular intra-section jumps didn't have.

One of the invariants of our disassembler is that there is no valid chain of execution that will hit the end of a segment or data; in other words, none that will force the CPU into the invalid instruction state. We allow conservative chains, so the CPU can still hit an invalid instruction because of a conditional branch (in other words, we allow conditional branches to hit data). To preserve this invariant we maintain a tree of disassembling tasks, so that once we hit data, we can unroll the chain up to the root that started it (or the first conditional branch) and cancel everything in between, marking it as data too.

This invariant doesn't hold for jumps between sections: when we see a jump instruction that goes out of the current memory region, we just assume that once we get this other region of memory, it will be disassembled nicely. However, later, when we actually get access to the memory region that contains the destination (our disassembler is incremental and is applied to each chunk of memory as it is discovered), we may figure out that the chain starting from this address is invalid and cancel this chain. But since we no longer have access to the disassembler state of the original memory region, we can't cancel the chain that led to that jump in the original memory region. Therefore later, when we build the whole-program CFG, we will start that chain, eventually hit data, and end up with an exception.

Solution
--------

Instead of discarding the task that breaches the segment boundaries, we accumulate it in a debt list, and every time we are handed a new memory region we first try to pay off the debts. If the task is now within the boundaries and we can prove that it hits data, then we cancel the whole chain, which can now cross section boundaries.

Caveats
-------

The debt is a list of tasks and each task references its parent tasks, so in fact it is a tree of instructions covering the whole program. We store the debt list in the disassembler state, which is saved on the hard drive, and if the debt list is large (and since the binary format can't preserve sharing) it can be quite large to store and to load. So far the assumption is that the debt list is either empty or very small after the project is fully disassembled. If this hypothesis does not hold, we can either cancel all unpaid debt at the end of disassembling or just ignore it and not store it on the disk.
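The debt-list idea from the commit message can be sketched roughly as follows. This is not BAP's implementation (BAP is written in OCaml); it is a minimal Python illustration, and every name in it (`Task`, `Disassembler`, `submit`, `feed`, `cancel_chain`, the `invalid` set standing in for "we can prove that it hits data") is hypothetical. It also assumes that cancellation stops at the first task reached through a conditional branch, leaving the branch itself intact.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class Task:
    """One disassembly step: a target address plus the task that led to it."""
    addr: int
    parent: Optional["Task"] = None
    conditional: bool = False  # True if reached through a conditional branch


def cancel_chain(task: Task, data: Set[int]) -> None:
    """Mark the chain ending at `task` as data, walking up the parent links
    until the root or the first conditionally reached task (assumption)."""
    node: Optional[Task] = task
    while node is not None:
        data.add(node.addr)
        if node.conditional:
            break
        node = node.parent


class Disassembler:
    def __init__(self) -> None:
        self.debt: List[Task] = []   # jumps whose destination is not mapped yet
        self.data: Set[int] = set()  # addresses cancelled and marked as data

    def submit(self, task: Task, region: range) -> None:
        """Defer a task instead of discarding it when it leaves the region."""
        if task.addr not in region:
            self.debt.append(task)

    def feed(self, region: range, invalid: Set[int]) -> None:
        """Called for each newly discovered region: pay off debts first.
        `invalid` is a stand-in for addresses proven to hit data."""
        still_owed = []
        for task in self.debt:
            if task.addr not in region:
                still_owed.append(task)            # destination still unknown
            elif task.addr in invalid:
                cancel_chain(task, self.data)      # cancel across the boundary
        self.debt = still_owed


# Usage: a chain in one region ends in a jump into another region.
text = range(0x1000, 0x2000)
other = range(0x3000, 0x4000)
root = Task(0x1000)
step = Task(0x1004, parent=root)
jump = Task(0x3000, parent=step)  # destination lies outside `text`

dis = Disassembler()
dis.submit(jump, text)             # goes into the debt list
dis.feed(other, invalid={0x3000})  # destination turns out to be data
```

Because each task keeps a reference to its parent, holding a single deferred task keeps its whole chain alive, which is the storage cost the caveats section describes.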
ivg added a commit to ivg/bap that referenced this issue on Jun 17, 2020
ivg added a commit that referenced this issue on Jun 17, 2020
I read in the overview that:
Is this not true for ARM, and what exactly is elf32-littlearm (vs. ARM)? I am unsure of the distinction.
While this architecture is not supported by BAP, it would be nice if we didn't fail on it. So far it looks like we got confused by the semantics of the last instruction at the end of the block, which is a Thumb instruction (in fact, two Thumb instructions) that also has an interpretation in ARM. We are currently investigating whether we can make the disassembler more robust to random inputs (we should), but in general, full support will be provided by #1122.
See fkie-cad/cwe_checker#61 for more details.
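To illustrate why the same bytes can be read both ways: an ARM-mode decoder consumes one 32-bit little-endian word, while a Thumb decoder sees two 16-bit halfwords in the same four bytes. The snippet below is only a sketch of that carving step; the byte values are illustrative and are not taken from the failing binary, and the helper names are hypothetical.

```python
import struct

# Four illustrative bytes; 0x4770 is the 16-bit Thumb encoding of `bx lr`.
raw = bytes([0x70, 0x47, 0x70, 0x47])


def as_arm_word(b: bytes) -> int:
    """Read the bytes as a single 32-bit little-endian ARM instruction word."""
    return struct.unpack("<I", b)[0]


def as_thumb_halfwords(b: bytes) -> tuple:
    """Read the same bytes as two 16-bit little-endian Thumb halfwords."""
    return struct.unpack("<HH", b)


print(hex(as_arm_word(raw)))                     # one candidate ARM word
print([hex(h) for h in as_thumb_halfwords(raw)])  # two candidate Thumb halfwords
```

Whether the 32-bit word is a valid ARM instruction, and what it means, depends on the decoder; the point is only that nothing in the bytes themselves says which mode applies.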