Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor error handling and logging in '''get_extractor''' - main.py #1845

Closed
wants to merge 5 commits into from

Conversation

aaronatp
Copy link
Contributor

@aaronatp aaronatp commented Nov 13, 2023

Checklist

  • No CHANGELOG update needed
  • No new tests needed
  • No documentation update needed

This PR is to hide the '''get_extractor''' error handling and logging in a decorator. This makes the main function a bit cleaner and reduces the boilerplate code used to log errors.

@mr-tz
Copy link
Collaborator

mr-tz commented Nov 13, 2023

Please ensure the CI checks and tests pass.

@aaronatp
Copy link
Contributor Author

@mr-tz will do!

@aaronatp
Copy link
Contributor Author

Hi @mr-tz the '''log_unsupported_format_error''' function, which is one of the functions that the '''catch_log_return_errors''' decorator is meant to replace, is used once elsewhere in the '''main''' function for error handling. As a result, there is still a lot of boilerplate code in helpers.py. I was going to see if I could refactor the error handling in that other location. Would there be any problems with that?

@williballenthin
Copy link
Collaborator

I realize @aaronatp and @mr-tz have been having ongoing discussions, but I must note that am not a fan of using a decorator to wrap exceptions like this. It is a clever piece of code - and that's my concern :-) This code doesn't look like other code in capa nor around the general Python ecosystem, so it's hard to follow.

I think we could accomplish the same by wrapping the get_extractor function in another function, like:

def get_extractor_2(...):
    try:
        return get_extractor(...)
    except A:
        ...
    except B:
        ...

That being said, I'm also not yet convinced that this PR is making the code simpler. I don't want to make a final decision yet, but I do want to make clear that I'm not sure I'll support changes like this. I'll chat with @mr-tz to hear more about the discussions & plans you all have made.

@aaronatp
Copy link
Contributor Author

aaronatp commented Nov 13, 2023

Hi @williballenthin thank you for explaining your thoughts - if you don't decide to merge this PR, that's ok! One of my concerns when writing the decorator was that some people would find it difficult to read because decorators are a somewhat uncommon code construct.

I also think the list unpacking in the current draft reads relatively poorly. I have been trying to think of a better way of unpacking all three variables (error_list, return_values, message_list) simultaneously but this may not be achievable in a straight-forward manner. I have also been thinking about how to make the logging_wrapper function less choppy.

I think the wrapper function you suggest could work well. I'll implement it and see if I can cut down on the logging code :)

@aaronatp aaronatp closed this Nov 14, 2023
@williballenthin
Copy link
Collaborator

thanks for your understanding @aaronatp and i look forward to continuing these discussions and experiments :-)

i think a major metric for assessing the changes will be how they simplify the code that users write for accessing capa. for example, there are a bunch of scripts in capa/scripts that import functionality from capa.main and reproduce a lot of capa.main.main. these scripts are probably similar to programs that other people write with capa. so, the refactoring that we do here should result in a lot of code cleanup (both deleting lines and making the remaining code more readable) in those scripts. if thats the case, then we can be confident that our users' lives are easier/better.

@aaronatp
Copy link
Contributor Author

@williballenthin thank you for explaining why some changes may be more beneficial than others. I didn't previously realize that functions like get_extractor are called directly by users - what you describe sounds like a very sensible metric. I will have a look in capa/scripts to better understand the considerations that go into simplifying and cleaning up different functions' code here :)

@mr-tz
Copy link
Collaborator

mr-tz commented Nov 15, 2023

Hey @aaronatp thanks for your understanding and @williballenthin's reasoning makes sense here. I hope you still gained some more with the code and want to continue contributing.

I went through our long list of issues and identified a few for you. Some more refactoring PRs to work on next could be:

If you want to start working on some other enhancements instead, there are for example:

Please read through them and ask all questions you may have in the respective issue.

Thank you!

@aaronatp
Copy link
Contributor Author

Hi @mr-tz thank you for suggesting these open issues, I will have a look at them! I did gain more knowledge of the code from these attempts and would like to continue trying to contribute :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants