Fix & update benchmarks and fuzzers #1541

Merged 59 commits on Nov 22, 2022

@DonggeLiu
Contributor

DonggeLiu commented Nov 4, 2022

  1. Fix and update (to the latest tag/release) the (nontrivial) benchmarks and fuzzers to adapt to Ubuntu:20.04 and Python3.10.8.
  2. Delete the trivial/buggy benchmarks and fuzzers.
    Base: Upgrade base images, benchmark images, and Python. #1526

Benchmarks

'[x]' means it can build and run under the new environment.

Support

Deprecate

Nothing so far, may add later.

Undetermined

  1. poppler
    • Failed to build in FuzzBench and OSS-Fuzz (CI cancelled).
    • Can add it back when it can build.
    • Can deprecate it if it continues to fail to build.

Fuzzers

Fix and update (to the latest tag/release) the (nontrivial) fuzzers to adapt to Ubuntu:20.04 and Python3.10.8.
Maybe also delete the outdated/trivial/buggy ones?

We want to support
([x] means it is up-to-date and passed the standard and oss-fuzz categories of CI tests and did not break in the bug category before the 5-hour timeout):

  • afl
  • aflfast
  • afl++
  • aflsmart
  • centipede (failing due to the weak reference issue, which should be acceptable?)
  • eclipser
  • fairfuzz
  • honggfuzz
  • libafl
  • libFuzzer
  • mopt
  • klee (Not sure if we want to remove it. It is actively maintained, but it seems we do not use it in our reports for some reason).
  • symcc_aflplusplus (I want to add SymCC-related fuzzers, as SymCC supports concolic execution and seems impactful. But I failed to fix its errors after trying for two days. SymCC still relies on Clang-10 and Python2, yet it also tries to support LLVM-15 in very recent PRs. Maybe let's wait a while until it can get rid of the ancient dependencies and become stable on LLVM-15?)

Not used as default fuzzers in general comparison evaluations

  • centipede_function_filter (Do we use it for daily fuzzing evaluations?)
  • introspector_driven_focus (Do we use it for daily fuzzing evaluations?)

Deprecate

([x] means it is up-to-date and passed the standard and oss-fuzz categories of CI tests and did not break in the bug category before the 5-hour timeout):

  • All variations
  • entropic
  • neuzz
  • pythia
  • fafuzz
  • tortoisefuzz
  • wingfuzz
  • weizz
  • fuzzolic_aflplusplus_z3
  • nautilus
  • gramatron
  • token_level
  • afl_2_52_b
  • libfuzzer_dataflow
  • lafintel (Do we want to keep this? It has not been updated in the past 6 years.)

Undecided

Nothing so far.

@DonggeLiu
Contributor Author

I noticed that we are using this repo for SymCC instead of the official repo.
Currently, it fails to build and I am planning to fix it.
Shall I continue using that repo or switch to the official one? Thanks! @jonathanmetzman

@DonggeLiu
Contributor Author

Which AFL shall we keep?
I noticed that we have AFL (which is archived) and AFL-2.52b (which has not been updated in 2 years).
Shall we keep both of them or stop supporting one of them?

@DonggeLiu
Contributor Author

DonggeLiu commented Nov 6, 2022

I reckon the bug category is still needed by the competition. Maybe we exclude the bug benchmarks from the daily experiments and CI tests, and only keep them for the competition?

I've tried to make bug benchmarks compatible with libfuzzer in Ubuntu:20.04 and Python-3.10.8. CI tests were not able to test them all due to the 5-hour time limit.

@DonggeLiu
Contributor Author

DonggeLiu commented Nov 17, 2022

Interesting. GitLab GNOME packages (e.g. libxml2, libxslt) are not available.
Our benchmarks are broken because of it.

@DonggeLiu
Contributor Author

DonggeLiu commented Nov 18, 2022

The general rule I followed is to consider libFuzzer as the default fuzzer and libpng-1.6.38 as the default benchmark.
If a benchmark B can build with libFuzzer then we can keep B, and if a fuzzer A failed to build with B, then we add A to the list of unsupported_fuzzers in B's benchmark.yaml.
Similarly, if a fuzzer A can build with libpng-1.6.38, then we keep A, and if A cannot build with another benchmark B, then we add A to the list of unsupported_fuzzers in B's benchmark.yaml.

In addition, we deprecate a fuzzer A if A is no longer maintained or relies on very old dependencies.
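
The keep/exclude rule above can be sketched in a few lines of Python. This is only an illustration, not code from this PR or from FuzzBench: the function name `triage`, the `build_ok` table, and the fuzzer/benchmark names other than libFuzzer and libpng-1.6.38 are hypothetical, and the lowercase spelling `libfuzzer` is an assumption. The only identifier taken from the discussion is the `unsupported_fuzzers` list in each benchmark's benchmark.yaml, which the returned mapping is meant to mirror.

```python
# Illustrative sketch only; names below are assumptions, not FuzzBench APIs.
DEFAULT_FUZZER = "libfuzzer"          # assumed spelling of the default fuzzer
DEFAULT_BENCHMARK = "libpng-1.6.38"   # default benchmark named in the comment


def triage(build_ok, fuzzers, benchmarks):
    """Apply the keep/exclude rule to a table of build results.

    build_ok maps (fuzzer, benchmark) -> True if that pair built.
    Pairs missing from the table are treated as build failures.
    """
    def ok(fuzzer, benchmark):
        return build_ok.get((fuzzer, benchmark), False)

    # A benchmark is kept if it builds with the default fuzzer.
    kept_benchmarks = [b for b in benchmarks if ok(DEFAULT_FUZZER, b)]
    # A fuzzer is kept if it builds with the default benchmark.
    kept_fuzzers = [f for f in fuzzers if ok(f, DEFAULT_BENCHMARK)]

    # For each kept benchmark, list the kept fuzzers that fail to build
    # with it; this is what would go under unsupported_fuzzers.
    unsupported = {
        b: sorted(f for f in kept_fuzzers if not ok(f, b))
        for b in kept_benchmarks
    }
    return kept_benchmarks, kept_fuzzers, unsupported


# Hypothetical build results for three fuzzers and three benchmarks.
example_fuzzers = ["libfuzzer", "afl", "brokenfuzz"]
example_benchmarks = ["libpng-1.6.38", "bench_b", "bench_c"]
example_build_ok = {
    ("libfuzzer", "libpng-1.6.38"): True,
    ("libfuzzer", "bench_b"): True,
    ("libfuzzer", "bench_c"): False,   # bench_c is dropped
    ("afl", "libpng-1.6.38"): True,
    ("afl", "bench_b"): False,         # afl becomes unsupported for bench_b
    ("brokenfuzz", "libpng-1.6.38"): False,  # brokenfuzz is dropped
}

kept_b, kept_f, unsupported = triage(
    example_build_ok, example_fuzzers, example_benchmarks
)
print(kept_b, kept_f, unsupported)
```

In this made-up example, `bench_c` is dropped because it does not build with the default fuzzer, `brokenfuzz` is dropped because it does not build with the default benchmark, and `afl` would be listed under `unsupported_fuzzers` for `bench_b` only. The deprecation criterion (unmaintained or very old dependencies) is a manual judgment and is not modeled here.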

@vanhauser-thc
Collaborator

Which AFL shall we keep? I noticed that we have AFL (which is archived) and AFL-2.52b (which has not been updated in 2 years). Shall we keep both of them or stop supporting one of them?

IMHO the archived one should be kept because people still like to reference against it. 2.52b should be dropped though.

@DonggeLiu
Contributor Author

Which AFL shall we keep? I noticed that we have AFL (which is archived) and AFL-2.52b (which has not been updated in 2 years). Shall we keep both of them or stop supporting one of them?

IMHO the archived one should be kept because people still like to reference against it. 2.52b should be dropped though.

Yep, thanks!
We think the same: This PR keeps AFL and drops AFL-2.52b : )

Are there any other fuzzers that you would suggest keeping/dropping?

@vanhauser-thc
Collaborator

I think lafintel, mopt, aflfast and fairfuzz can be dropped. aflsmart is also pretty outdated, but it's one of the few structured fuzzing implementations, so that is why I would keep it.

@jonathanmetzman
Contributor

Interesting. GitLab GNOME packages (e.g. libxml2, libxslt) are not available. Our benchmarks are broken because of it.

Let me close and reopen and see if that fixes things

@DonggeLiu
Contributor Author

Interesting. GitLab GNOME packages (e.g. libxml2, libxslt) are not available. Our benchmarks are broken because of it.

Let me close and reopen and see if that fixes things

Ah sorry, that was fixed automatically on the following day.
They were only down for one night.

@DonggeLiu
Contributor Author

I think lafintel, mopt, aflfast and fairfuzz can be dropped. aflsmart is also pretty outdated, but it's one of the few structured fuzzing implementations, so that is why I would keep it.

Thanks!
Yeah, lafintel has not been maintained for years, so I excluded it in this PR.
I will have another PR dedicated to excluding more of those fuzzers (and benchmarks) later.

@DonggeLiu
Contributor Author

OK, I will merge this back to its base branch as discussed.
Will have another PR to remove redundant fuzzers/benchmarks.

4 participants