Python Coverage Question #11215
Hello,

I'd appreciate some more clarity on how ClusterFuzz calculates Python coverage. If I run coverage locally, I get above 50% coverage for a project I am working on. However, on ClusterFuzz, the same project's coverage report sits at 37%, due to the inclusion of extraneous site-packages used by the library being fuzzed.

So, my questions are the following:

Thank you for your review and help!

Comments
Could you clarify how you generate the coverage report locally? The commands specifically? And which project is this?
@DavidKorczynski This one is specifically icalendar, and I'm following the guidance in the atheris documentation here.
Okay, they use the same approach. FYI you can generate coverage easily by way of OSS-Fuzz; simply run the command:

`python3 infra/helper.py introspector icalendar`

This will produce a default corpus and generate a code coverage report, but you can adjust it, e.g. run the fuzzers for longer to collect a larger corpus, or rely on an existing corpus. The reason you see extra packages in your code coverage reports on OSS-Fuzz is that OSS-Fuzz bundles your fuzzers + target code by way of PyInstaller.
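For reference, the steps the introspector command automates can also be run individually with the standard helper commands from the OSS-Fuzz docs; a minimal sketch, where the fuzz target name `ical_fuzzer` and the corpus path are placeholders:

```sh
# Build the project's fuzzers with coverage instrumentation
python3 infra/helper.py build_fuzzers --sanitizer coverage icalendar

# Generate the HTML coverage report from an existing corpus
# (fuzz target name and corpus directory are placeholders)
python3 infra/helper.py coverage icalendar \
    --fuzz-target=ical_fuzzer \
    --corpus-dir=./corpus/ical_fuzzer
```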
I don't think it should filter them out, though, largely because I consider 3rd party deps to be just as much a part of your application as the code you've written yourself. I don't think we should exclude them for Q2 above.
@DavidKorczynski Thank you for the clarification and the tip on coverage generation; I'll be sure to use that moving forward. Based on my experience harnessing Python projects, a lot of Python libraries pull in large dependencies for a very small subset of their features. For libraries that import numpy or pandas, for example, I don't think it's possible to achieve 50% code coverage if the percentage is heavily swayed by tens of thousands of lines of code that aren't even reachable from the library being tested. Do you have any tips for handling these cases and how to approach the coverage situation for them? Thank you for your help and advice!
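For purely local measurements (and separate from how the OSS-Fuzz report is produced), coverage.py's standard `--omit` path patterns can exclude bundled dependencies so the local number reflects only the library under test; a minimal sketch, with the harness file name as a placeholder:

```sh
# Run the harness under coverage.py, then report without site-packages
coverage run fuzz_icalendar.py
coverage report --omit='*/site-packages/*,*/dist-packages/*'
```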
I don't see the ability to cover code as changing just because we filter stuff out -- I get that if we filter stuff out, the number reported at the top of the code coverage report (the accumulated coverage of all files in the report) will go up, but the actual code being covered remains the same. This seems to be more related to Q2 above? If so, then I think I'd prefer to keep coverage as is and just provide argumentation for why code coverage is X percent (not necessarily what the coverage report says), perhaps with arguments leaning towards code coverage of the attack surface. FWIW this is not a problem unique to Python; all of the other languages have the same issue. See e.g. the Envoy code coverage report, where at least 1.2 million lines of code are external dependencies -- some of it is bloat and some isn't, but it's very likely impossible to get the code coverage report to 100%.
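To make that concrete with an illustrative calculation (all numbers invented for the example): if a fuzzer covers 5,000 of a library's 10,000 lines, a library-only report reads 50%; bundle in 15,000 largely unreached dependency lines and the headline figure drops to 5,000 / 25,000 = 20%, even though exactly the same code is being exercised.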
The problem is this can be tricky to assess. For some cases there may be obvious bloat, but for others it may not be obvious, and in many cases we are quite interested in understanding what part of 3rd party deps has coverage. For example, say a given project relies on a 3rd party image parsing/serialization library; then we definitely want to know the coverage of this parsing logic, and if we filter it out we are likely missing insight into an important part of the attack surface. We could show this as follows:

```python
def parse_user_input(raw_user_input):
    serialized_inp = some_third_party_serialization_lib.parse_raw_data(raw_user_input)
    ...
```

In this case, getting 100% coverage of the line calling `parse_raw_data` tells us little unless we also know how much of the third-party parsing logic itself is covered. Naturally, in many cases Fuzz Introspector performs reachability analysis to assist in this process, but the dynamic nature of Python makes the problem particularly difficult. I think this also speaks to not filtering stuff out, as a general solution is likely to exclude code that is reachable/on the attack surface. For the purposes of tracking code coverage and using it as an estimate for completion analysis, I think it would be nice to have improved insights into code coverage. I think assessing code coverage on a folder level helps with this, but it will still be subject to the limitation mentioned above. Again, regarding Q2 above, I think it would be most complete to have some more qualitative argument as to why code coverage is X percent.
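As a rough local illustration of the folder-level idea, coverage.py's standard `--include` path filter can already slice a report by directory (the paths below are illustrative, and this is separate from how OSS-Fuzz renders its reports):

```sh
# Headline number for the library itself only
coverage report --include='*/icalendar/*'

# Coverage within one dependency of interest, e.g. an image parser
coverage report --include='*/site-packages/PIL/*'
```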
There is a proposed solution to the same problem in Java here: #10860. In essence we have the same option in C/C++, because there we can control which code is instrumented for code coverage. Let me see about the option of adding prefixing capabilities to Python code coverage.
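To make the prefixing idea concrete, it could in principle be done as post-processing over coverage.py's JSON output; a minimal hypothetical sketch (the prefixes and file names are illustrative, and this is not how OSS-Fuzz implements it):

```python
import json

# Hypothetical post-processing: keep only files whose path starts with
# an allow-listed prefix, then recompute the summary percentage.
ALLOWED_PREFIXES = ("icalendar/", "src/icalendar/")  # illustrative

with open("coverage.json") as f:  # produced by `coverage json`
    report = json.load(f)

kept = {path: info for path, info in report["files"].items()
        if path.startswith(ALLOWED_PREFIXES)}

covered = sum(info["summary"]["covered_lines"] for info in kept.values())
total = sum(info["summary"]["num_statements"] for info in kept.values())
if total:
    print(f"Filtered coverage: {100.0 * covered / total:.1f}%")
```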