Skip to content

Alamvic/compiler-fuzzing-course

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

61 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Compiler Fuzzing Course

This repository contains the material of a compiler fuzzing course originally developed for the Escuela de Ciencias Informáticas'23 (ECI'23) at Universidad of Buenos Aires.

Modules

The course is divided into several mini-modules:

  • Module 0.1: Testing Refresher. Importance of testing. RIGHT BICEP principle: Right, boundaries, inverse and error conditions, cross-checks, and performance. Automated testing. Unit testing frameworks. Examples: JUnit and SUnit.

  • Module 0.2: Compiler Architecture. Compilation pipelines. Compiler frontend and backend. Parsing, name-resolution, and type checking. Instruction scheduling, register allocation, and code generation. Optimizing compilers, intermediate languages, and optimizations. SSA form. Examples: Clang+LLVM, JVM, and Pharo.

  • Module 1: Compiler Testing 101. The need, importance, and challenges of compiler testing. Strategies for testing: unit testing and regression testing. Example of a register allocator Unit testing. Regression tests as black-box maintainable tests. Parametrized tests. Simulation and emulation for compiler tests.

  • Module 2: Introduction to Fuzzing. Testing with randomly generated data. Fuzzing. Guide fuzzers with domain knowledge. Building an initial fuzzing framework. Evaluating results.

  • Module 3: Grammar-Based Fuzzing. Compilers Phases as fuzzing barriers. Syntactic fuzzing based on grammars. Building a Grammar fuzzer.

  • Module 4: The Oracle Problem. The Oracle problem. Derived, specified, and implicit oracles. Dealing with the absence of oracles. Introducing the Oracle problem for compilers.

  • Module 5: Differential Testing. Comparing existing systems to evaluate behavior. Dealing with the absence of ground truth.

  • Module 6: Cross-Optimization Testing. Obtaining systems to compare by variating compilers. Variate optimizations. Variate architecture.

  • Module 7: Mutation Analysis. Computing test case quality as the ability to detect bugs. Competent developer hypothesis. Simulating developer errors with artificial mutations. Mutation operators, mutations, mutants. Killed and surviving mutants. Mutation score. Equivalent mutants and trivial mutants.

  • Module 8: Mutation for Compiler Testing. Mutants as fuzzers. String-based mutations. Domain/Language-specific mutations. AST-based mutations. Combining fuzzers.

  • Module 9: Interpreter-Guided Compiler Testing. Interpreters as executable semantics. Deriving fuzzers and oracles from interpreters. Concolic testing/execution.

  • Module 10: Equivalence Module Input. Differential testing on inputs. Metamorphic relations. Semantic-preserving mutations as compiler metamorphic relations.

Bibliography

  • A. Zeller, R. Gopinath, M. Böhme, G. Fraser and C. Holler. (2021). The Fuzzing Book. CISPA Helmholtz Center for Information Security. https://www.fuzzingbook.org/. Retrieved 2021-10-26 21:30:20+08:00.
  • Barr, E. T., Harman, M., McMinn, P., Shahbaz, M., & Yoo, S. (2014). The oracle problem in software testing: A survey. IEEE transactions on software engineering, 41(5), 507-525.
  • Claessen, K., & Hughes, J. (2000, September). QuickCheck: a lightweight tool for random testing of Haskell programs. In Proceedings of the fifth ACM SIGPLAN international conference on Functional programming (pp. 268-279).
  • Chen, T. Y., Cheung, S. C., & Yiu, S. M. (2020). Metamorphic testing: a new approach for generating next test cases. arXiv preprint arXiv:2002.12543.
  • Polito, G., Tesone, P., Ducasse, S., Fabresse, L., Rogliano, T., Misse-Chanabier, P., & Hernandez Phillips, C. (2021, September). Cross-ISA testing of the Pharo VM: lessons learned while porting to ARMv8. In Proceedings of the 18th ACM SIGPLAN International Conference on Managed Programming Languages and Runtimes (pp. 16-25).
  • Godefroid, P., Kiezun, A., & Levin, M. Y. (2008, June). Grammar-based whitebox fuzzing. In Proceedings of the 29th ACM SIGPLAN conference on programming language design and implementation (pp. 206-215).
  • Godefroid, P., Klarlund, N., & Sen, K. (2005, June). DART: Directed automated random testing. In Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation (pp. 213-223).
  • McKeeman, W. M. (1998). Differential testing for software. Digital Technical Journal, 10(1), 100-107.
  • Polito, G., Ducasse, S., & Tesone, P. (2022, June). Interpreter-guided differential JIT compiler unit testing. In Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation (pp. 981-992).
  • DeMillo, R. A., Lipton, R. J., & Sayward, F. G. (1978). Hints on test data selection: Help for the practicing programmer. Computer, 11(4), 34-41.
  • Fraser, G., & Zeller, A. (2010, July). Mutation-driven generation of unit tests and oracles. In Proceedings of the 19th international symposium on Software testing and analysis (pp. 147-158).
  • Converse, H., Olivo, O., & Khurshid, S. (2017, March). Non-semantics-preserving transformations for higher-coverage test generation using symbolic execution. In 2017 IEEE International Conference on Software Testing, Verification and Validation (ICST) (pp. 241-252). IEEE.
  • Nagy, S., & Hicks, M. (2019, May). Full-speed fuzzing: Reducing fuzzing overhead through coverage-guided tracing. In 2019 IEEE Symposium on Security and Privacy (SP) (pp. 787-802). IEEE

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published