Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial Cutter integration #65

Open
wants to merge 19 commits into
base: develop
Choose a base branch
from

Conversation

xarkes
Copy link
Contributor

@xarkes xarkes commented Apr 10, 2019

This PR aims at adding lighthouse support for Cutter. Feel free to tell me if anything is wrong, I won't have much time myself to improve the PR more, but hopefully some people will do :)

Screenshot
image

How to install

  • Same as in IDA and Binary Ninja: Copy ligthouse and lighthouse_plugin.py into ~/.local/share/RadareOrg/Cutter/plugins/python/

Known issues:

  • Fix functionRenamed() (hopefully with cutter 1.8.2? https://github.com/radareorg/cutter/issues/1482)
  • Need correct macOS directory for get_disassembler_user_directory() in cutter_api.py
  • The graph view does not refresh automatically when switching coverage sets
  • No highlight in the disassembly widget (only graph)
  • Add support for Right Click -> Coverage XRef menu action (c.f. Feature request: Hover & show trace covering block #8)
  • Review / cleanup threading code
  • self._stream.write(buf) crashes (at least on windows) cuz cutter doesn't have a STDOUT stream

#@not_mainthread
# TODO Reenable not_mainthread
def get_function_name_at(self, address):
# TODO User Cutter API

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use*

@@ -566,7 +567,10 @@ def _update_functions(self, fresh_metadata):
#

if new_metadata.empty:
del self.functions[function_address]
try:
del self.functions[function_address]
Copy link
Owner

@gaasedelen gaasedelen Apr 10, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may have been to account for a bug I fixed in a55ede7, I will see if it is safe to remove this change from your code now.

# Cutter already provides this information, so just fetch it
if disassembler.NAME == "CUTTER":
try:
return int(cutter.cmd('afCc @ ' + str(self.address)))
Copy link
Owner

@gaasedelen gaasedelen Apr 10, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to avoid inline disassembler casing. We have the disassembler API abstraction explicitly so we don't have to have if/else trees everywhere. It makes the code both more maintainable, and readable.

The exception to that rule is mostly class monkey patching, which is done to ensure performance for hot paths.

I'll evaluate whether it is reasonable to add complexity calculations to the disassembler API, or if it is okay to simply use the existing lighthouse code. Complexity calculations are already pretty fast in lighthouse, but I understand that r2 might offer more 'accurate' scores.

@gaasedelen
Copy link
Owner

Thanks for the PR! You have done great work, and this is an awesome contribution for the project to receive from the community.

I went through the PR and made some initial notes. In particular, the code I question and scrutinize the most from any contribution is code that affects everyone -- eg, any changes that are executed on every platform and disassembler.

With regard to some of the disassembler threading code, I don't expect anyone to fully grasp its purpose or implementation. It requires a lot of context with both the existing lighthouse codebase, and the disassembler platforms (ida/binja/r2). I will have to evaluate r2/cutter a bit more and work out the appropriate implementation myself.

I don't expect any additional work from you or the r2 guys, but always welcome any additional commits.

Thanks!

@xarkes
Copy link
Contributor Author

xarkes commented Apr 10, 2019

I will try to fix the comments you addressed when I find some time before next week. Thanks!

@gaasedelen gaasedelen added this to the v0.9.0 milestone Apr 10, 2019
@gaasedelen gaasedelen self-assigned this Apr 10, 2019
@xarkes
Copy link
Contributor Author

xarkes commented May 11, 2019

Hey, sorry for the delay I said "before next week", but I think it's passed by now :3
Anyway, hopefully I will find some time in the beginning of June to finish this :)

@gaasedelen
Copy link
Owner

No rush, I haven't had much time to look at it but I will try to work on it some more over the next few days.

I fixed up a bunch of styling issues that cutter exposed, these changes are on dev. The cutter integration should look 'better' once we merge it into dev.

Most of the 'threading' TODOs are a non-issue, I think I removed / cleaned up most of them but haven't committed it yet.

One of the last things I was working on before I stopped was fixing the functionRenamed signal stuff. Hopefully https://github.com/radareorg/cutter/issues/1482 means it should be usable in 1.8.3.

IIRC, there are still a few quirks that I haven't had a chance to look at, such as the graph not automatically refreshing when using the 'hotshell' mode.

@gaasedelen gaasedelen force-pushed the cutter_integration branch from 50ca546 to 0359fc5 Compare May 17, 2019 23:22
@gaasedelen
Copy link
Owner

gaasedelen commented May 17, 2019

I pushed a few more things onto dev and rebased this PR ontop of it with a few more minor updates. Sorry for the force push / dirtying up the PR, hopefully the final squash & merge will clean up the history.

I updated the 'known issues' section of the initial PR. I would like to resolve these before officially merging. Once these are done, we can test it a bit more for general stability.

@ITAYC0HEN
Copy link

Fix functionRenamed() (hopefully with cutter 1.8.2

I think that this is fixed now, isn't it? The related PR is merged on Cutter edge

@xarkes
Copy link
Contributor Author

xarkes commented Jul 30, 2019

Hey there,
Just wanted to say that I did not forget this PR, I'm just busy and I have some more important stuff to take care about right now as well, so that's why I could not improve it yet :)

@XVilka
Copy link

XVilka commented Sep 9, 2019

Are there any updates?

@xarkes
Copy link
Contributor Author

xarkes commented Sep 9, 2019

I started to rebase my branch on develop and updated Cutter, new bugs were introduced and I did not have the time to fix all of them. This is one of my next things to take care of.

@gaasedelen
Copy link
Owner

I have time to work on closing out this PR this week (sept 16 - 20). I am hoping to spend some time on doing the binja overhaul soon (early october?) and so I will probably just merge this is whether or not support is complete.

I re-sorted the TODO list at the top of this PR by what is most likely to get resolved this week / without requiring any additional cutter changes / features to land. I think we will have to forego disas highlighting and the xref features for the time being.

@ITAYC0HEN
Copy link

Hi @gaasedelen :)
Not sure you still actively maintain Lighthouse, it seems like 2019 was a bit dry. But please let us know if you can find the time so we can make the final fixes together and push it.
We will be happy to push it forward and finish the small things that are left to fix :D

@gaasedelen
Copy link
Owner

gaasedelen commented Mar 29, 2020

I've been trying to catch up on lighthouse maintenance. I looked at this branch and was honestly kind of surprised it still worked (granted, I haven't worked on lighthouse much...)

There's still a lot of issues.

  1. I still can't get function rename hooks to trigger? I thought this would have been fixed by #1482?? But maybe something is still goofed up.

  2. Performance is like... really bad. I imagine the bulk of the performance problems are simply the cutter_api.py shims. Specifically in building the metadata cache is where we make heavy use of the disassembler shims to extract information. This means anything called from the _cutter_refresh_ node/function code in metadata.py is going to be the culprit.

  3. ... actually let's just focus on perf for now

To give some perspective, I tested loading the same cutter.exe binary in IDA and Cutter. When attempting to load coverage, Lighthouse is able to build a complete metadata cache of the database in ~2.8 seconds, but running under Cutter it took ~418 seconds.

I can accept shipping a 10-20x slowdown from IDA in an experimental release, but I will not ship Lighthouse at 150x slowdown. That's a shitty user experience.

I think this highlights what I am asking of you guys. I don't my way around the Cutter API (is it even documented ...?) or r2's command schema, so I would appreciate if you could look at cutter_api.py and see what can be improved.

For example, get_functions() is supposed to return a list of all the function starts in the binary, but it takes over 100 seconds on cutter.exe. This shim should be near instant. I suspect get_function_at() would also be a big win, as that will be called repeatedly during the metadata caching process.

Please test your work against a larger executable such as cutter.exe and let me know what you can do.

@Emi1305
Copy link

Emi1305 commented Jul 3, 2020

Hi, I'm trying to continue this pull request as I'd like to use this plugin with Cutter. I've been able to merge the current develop branch with the cutter integration in this pull request and I'm currently trying to adapt it to use the new API (LighthouseCoreAPI and LighthouseContextAPI). But when I try to load a file the following error shows:

Traceback (most recent call last):
File "/home/emid/.local/share/RadareOrg/Cutter/plugins/python/lighthouse/integration/cutter_integration.py", line 41, in interactive_load_file
super(LighthouseCutter, self).interactive_load_file()
File "/home/emid/.local/share/RadareOrg/Cutter/plugins/python/lighthouse/integration/core.py", line 343, in interactive_load_file
lctx = self.get_context(dctx)
File "/home/emid/.local/share/RadareOrg/Cutter/plugins/python/lighthouse/integration/cutter_integration.py", line 32, in get_context
lctx = LighthouseContext(self, dctx)
File "/home/emid/.local/share/RadareOrg/Cutter/plugins/python/lighthouse/context.py", line 30, in __init__
self.metadata = DatabaseMetadata(self)
File "/home/emid/.local/share/RadareOrg/Cutter/plugins/python/lighthouse/metadata.py", line 95, in __init__
self._rename_hooks = disassembler[lctx].create_rename_hooks()
TypeError: create_rename_hooks() missing 2 required positional arguments: 'function_address' and 'new_name'

As it shows the abstract method create_rename_hooks requires a function_address and a new_name parameters. I looked at the binja and IDA api files and neither of them implement this method with that arguments. I don't have any of those tools to check how it's actually working.

Could any of you guys point me in the right direction to fix this?

Edit: Furthermore, similar issue happens with CoverageDatabase contructors and how they are invoked. Is this project working as it is?

@gaasedelen
Copy link
Owner

I've been able to merge the current develop branch with the cutter integration in this pull request and I'm currently trying to adapt it to use the new API (LighthouseCoreAPI and LighthouseContextAPI).

Cool! You're a brave one :-)

Yes those abstractions are quite new and this branch needs to be ported onto them... They were introduced in the recent 'binja refactor' for v0.9. Their purpose is to support the notion of 'multiple databases' open in one process, where each lighthouse 'context' roughly equates to an open database.

As it shows the abstract method create_rename_hooks requires a function_address and a new_name parameters.

You are correct -- api.py is wrong. The correct function signature should be no arguments passed to create_rename_hooks().

I am not sure why Python did not actually throw an abstraction warning, but I guess the metaclassing doesn't actually check the full function signatures (TIL) ?!? I believe I refactored the rename hooking code in the v0.9 update, so that probably explains why the function and comment are stale.

Furthermore, similar issue happens with CoverageDatabase contructors and how they are invoked. Is this project working as it is?

Yes, this project is working fine on IDA/Binja. Do you have an example / callstack of the issue?

Sorry, I know it can get a bit complicated due to the shimming/compat/abstraction. I'll do my best to answer any other questions you may have though.

@Emi1305
Copy link

Emi1305 commented Jul 7, 2020

Hi, I've made some changes based on your last response

You are correct -- api.py is wrong. The correct function signature should be no arguments passed to create_rename_hooks().

I changed the signature in api.py and the project was loaded successfully.

Furthermore, similar issue happens with CoverageDatabase contructors and how they are invoked. Is this project working as it is?

Yes, this project is working fine on IDA/Binja. Do you have an example / callstack of the issue?

Sorry, this was an error on my side, so nothing to worry here

Sorry, I know it can get a bit complicated due to the shimming/compat/abstraction. I'll do my best to answer any other questions you may have though.

Currently I'm running into an issue with the method execute_read as is called in metadata.py:

07-07-2020 13:06:06 |           Lighthouse.Context |   DEBUG: Captured filenames from file dialog:
07-07-2020 13:06:06 |           Lighthouse.Context |   DEBUG:  - /tmp/drcov-test
07-07-2020 13:06:06 |            Lighthouse.STDERR |   ERROR: self.run()
07-07-2020 13:06:06 |            Lighthouse.STDERR |   ERROR:   File "/usr/lib/python3.8/threading.py", line 870, in run
07-07-2020 13:06:06 |            Lighthouse.STDERR |   ERROR: self._target(*self._args, **self._kwargs)
07-07-2020 13:06:06 |            Lighthouse.STDERR |   ERROR:   File "/home/emid/.local/share/RadareOrg/Cutter/plugins/python/lighthouse/util/misc.py", line 56, in wrapper
07-07-2020 13:06:06 |            Lighthouse.STDERR |   ERROR: return f(*args, **kwargs)
07-07-2020 13:06:06 |            Lighthouse.STDERR |   ERROR:   File "/home/emid/.local/share/RadareOrg/Cutter/plugins/python/lighthouse/metadata.py", line 462, in _refresh_async
07-07-2020 13:06:06 |            Lighthouse.STDERR |   ERROR: completed = self._refresh(progress_callback, True)
07-07-2020 13:06:06 |            Lighthouse.STDERR |   ERROR:   File "/home/emid/.local/share/RadareOrg/Cutter/plugins/python/lighthouse/metadata.py", line 508, in _refresh
07-07-2020 13:06:06 |            Lighthouse.STDERR |   ERROR: total = len(function_addresses)
07-07-2020 13:06:06 |            Lighthouse.STDERR |   ERROR: TypeError
07-07-2020 13:06:06 |            Lighthouse.STDERR |   ERROR: :
07-07-2020 13:06:06 |            Lighthouse.STDERR |   ERROR: object of type 'NoneType' has no len()

This seems strange as I put some prints in the get_function_addresses method of the CutterCoreAPI and those are never printed - therefore the method is not called.

I suspect there may be an error in how I'm creating the LighthouseContext in the get_contexts method:

def get_context(self, dctx, startup=True):
    if dctx not in self.lighthouse_contexts:
        # create new 'context'
        lctx = LighthouseContext(self, dctx)
        if startup:
            lctx.start()
        # save the created ctx
        self.lighthouse_contexts[dctx] = lctx
    return self.lighthouse_contexts[dctx]

@gaasedelen
Copy link
Owner

execute_read / execute_write / execute_ui are kind of another tricky part of lighthouse. They serve as normalized versions of 'synchronization' primitives for interfacing with the underlying disassembler in a 'thread' safe manner.

For example, it is dangerous (as in prone to crashes) to read from the IDA database or use some of the IDA API's from a background thread without taking the correct precautions. IDA provides some API's to synchronize access to the database (eg, idaapi.execute_sync(...)) and safely interface with the database from a background thread.

When you see the @disassembler.execute_read decorator over a function, lighthouse will ensure that the function gets executed in a manner that is 'safe' and 'consistent' with that disassembler's implementation. Read is for reading safely reading data from the database, Write is to potentially modify the database, and UI is mostly for Qt related actions.

Lucky for you, r2/cutter doesn't really have any sort of thread safety / synchronziaton mechanisms in place -- so it's all pretty YOLO. If I remember correctly, the execute_read/write primitives I wrote for cutter are just stubs that.. do nothing.

Another point of confusion is that IDA must do basically everything in the mainthread, where Binja requires you to do most database interaction in a 'background threaded worker'. So yes, the implementations can be pretty unique per disassembler...


I started trying to breakdown what could actually be failing with the get_function_addresses, but it's actually hard to say without taking a closer look. There's a lot of moving parts here, and a lot of little things that you have to get just right to get before everything just kind of 'starts to work' integration-wise.

Could you push your work/contributions so I can poke around a bit, and see if I can get your port up to a baseline on the new interfaces?

@Emi1305
Copy link

Emi1305 commented Jul 10, 2020

Thanks for the help. The issue was that I put some methods in the wrong place while migrating to the new CutterContextAPI and CutterCoreAPI

I've uploaded the code to my fork in the cutter_integration branch. Currently I'm working on the UI as the widget is not showing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants