[EXPERIMENTAL] Syntactic and Semantic Mutations #3

Draft
wants to merge 114 commits into
base: constraint-mutations

@FedeLoch FedeLoch commented Sep 24, 2024

Syntactic and Semantic Mutations

After analysing several ideas for constraint mutations against the way the Concolic algorithm works, we realized that mutating the path currently being explored is risky. Our current Concolic algorithm works iteratively: it takes the previously explored path and its constraints, solves those constraints, and uses the solution to reach the next path to explore. If we mutate the current path, the next one could be corrupted by our mutation.

With this impact in mind, we decided to take another approach. We propose to let the Concolic algorithm run its exploration untouched, finding and solving all the paths, and to mutate those paths only just before they are used to test the compiler.
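A minimal sketch of this ordering, in Python rather than in the project's own code: exploration finishes first, and mutated variants are derived from the finished result without touching it. Every name here (`explore_all`, `swap_strict_gt`, `paths_to_test`, and the string encoding of constraints) is hypothetical and only illustrates the idea.

```python
from typing import Callable, List, Tuple

Constraint = str                      # hypothetical: constraints kept as plain strings
Path = Tuple[Constraint, ...]         # an explored path is an immutable tuple of constraints
Mutation = Callable[[Path], Path]

def explore_all() -> List[Path]:
    """Stand-in for the unmodified concolic exploration (made-up data)."""
    return [("isInt(A)", "A > 3"), ("isInt(A)", "A <= 3")]

def swap_strict_gt(path: Path) -> Path:
    """Example mutation: rewrite 'A > 3' as 'A >= 4' (equivalent for integers)."""
    return tuple("A >= 4" if c == "A > 3" else c for c in path)

def paths_to_test(explored: List[Path], mutations: List[Mutation]) -> List[Path]:
    """Return each explored path followed by its mutated variants.

    The explored list is only read, never modified, so a mutation cannot
    corrupt the result of the exploration.
    """
    result: List[Path] = []
    for path in explored:
        result.append(path)
        for mutate in mutations:
            variant = mutate(path)
            if variant != path:       # skip mutations that did not apply
                result.append(variant)
    return result

explored = explore_all()
tests = paths_to_test(explored, [swap_strict_gt])
```

Because the mutation step consumes the finished exploration result, the next-path computation of the Concolic algorithm never sees a mutated constraint.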

The difficulty is that the mutations must be made carefully: if we want to ensure the same behaviour between the interpreter and our compiler, we need to design and implement mutations that don't change the semantics of the explored path.

This approach led us to consider two scenarios. In the first, we want to test that the interpreter and the compiler behave the same: we check whether the compiler's behaviour differs from the bytecode interpreter's, so the mutations must not change the semantics of the path. In the second, we deliberately mutate the semantics of the generated paths and observe how often the compiler's behaviour differs from that of our bytecode interpreter.

Methodology/Workflow

We implemented two methods:

  • RABytecodeAutoTest solutionsFor: aBytecode (returns all the possible solutions for a given bytecode)
  • RAPrimitiveAutoTest solutionsFor: aPrimitive (returns all the possible solutions for a given primitive)

With these two methods, we intend to capture all the solutions for a specific bytecode or primitive. This lets us understand the common constraint paths generated for a given bytecode or primitive and use them to design mutations over them.

Our main idea is to add constraint mutations that force the constraint solver to explore new values for the variables involved, either keeping the semantics to avoid inconsistencies or changing them when we want to confirm behaviour differences.

Compiler's Coverage

Part of our goal is to guarantee that we are testing the semantic equivalence between our interpreter and the compiler, so we want to make sure that these new path mutations increase the code coverage.

Before our mutation experiments, the percentage of code covered by our Concolic exploration using RABytecodeAutoTest is 13%:

Screenshot 2024-10-11 at 14 51 21

Using RAPrimitiveAutoTest, it is 18%:

Screenshot 2024-10-11 at 15 04 11

Running both test suites, we get 22% compiler coverage:

Screenshot 2024-10-12 at 13 30 05

This means that we need another way to explore more compiler paths.
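As a rough reading of the three figures above, and assuming all of them are measured over the same compiler code, inclusion-exclusion gives the share covered by both suites. The variable names are illustrative, and since the reported percentages are presumably rounded, the results are approximate.

```python
# Coverage percentages reported above (assumed to share the same denominator).
bytecode_cov = 13   # RABytecodeAutoTest alone
primitive_cov = 18  # RAPrimitiveAutoTest alone
combined_cov = 22   # both suites together

# Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|
overlap = bytecode_cov + primitive_cov - combined_cov
only_bytecode = bytecode_cov - overlap
only_primitive = primitive_cov - overlap
uncovered = 100 - combined_cov

print(overlap, only_bytecode, only_primitive, uncovered)  # 9 4 9 78
```

Under that assumption, about 9 points of coverage are shared by the two suites, and roughly 78% of the compiler code is reached by neither.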

Syntactic Mutations

The idea of these mutations is to add, remove, or update constraints in the current path while keeping its semantics. We mutate the constraints in the path with the expectation of exposing differences that we cannot see with the deterministic algorithm.

We are starting with simple, semantics-preserving EDITION mutations (the n+1 and n-1 rewrites assume that Var(A) is an integer):

  • Var(A) > n -> Update to Var(A) >= n+1

  • operand1 OP operand2 -> Update to operand2 OP operand1 [ as long as OP is commutative ]

  • Var(A) < n -> Update to Var(A) <= n-1

  • n < Var(A) -> Update to Var(A) >= n+1

  • Var(A) >= n -> Update to Var(A) = n OR Var(A) > n
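The comparison rewrites in the list above can be sanity-checked by brute force over a small integer range. The following is a sketch in Python with a made-up encoding of a constraint as an `(operator, n)` pair on a single variable; none of these names come from the project.

```python
import operator
from typing import Tuple

Constraint = Tuple[str, int]  # hypothetical encoding: (comparison, n) applied to Var(A)

OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}

def edit(constraint: Constraint) -> Constraint:
    """Apply the strict-to-non-strict EDITION rewrites; leave other constraints as-is."""
    op, n = constraint
    if op == ">":
        return (">=", n + 1)   # Var(A) > n  ->  Var(A) >= n+1
    if op == "<":
        return ("<=", n - 1)   # Var(A) < n  ->  Var(A) <= n-1
    return constraint

def holds(constraint: Constraint, a) -> bool:
    """Evaluate the constraint for a concrete value of Var(A)."""
    op, n = constraint
    return OPS[op](a, n)

def equivalent(c1: Constraint, c2: Constraint, domain=range(-50, 51)) -> bool:
    """True if both constraints accept exactly the same integers in the domain."""
    return all(holds(c1, a) == holds(c2, a) for a in domain)

print(equivalent((">", 3), edit((">", 3))))   # True
print(equivalent(("<", 3), edit(("<", 3))))   # True
# The rewrite is not valid for non-integers: 3.5 > 3 holds, but 3.5 >= 4 does not.
print(holds((">", 3), 3.5), holds(edit((">", 3)), 3.5))  # True False
```

The last line shows why these rewrites should only be applied to variables already constrained to be integers.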

To force the constraint solver to assign more interesting values to our variables, we propose the following INSERTION mutations:

  • isX(A) -> Add isNotY(A), for all Y != X

  • Var(A) >= n -> Add Var(B) such that Var(A) + Var(B) >= Var(B) + n

Considering randomness:

  • IsNotInt(A) -> Add A != rand(-inf, +inf)

  • Var(A) AOP n -> Add Var(A) + X AOP n + X, where X = Random(-inf, +inf)
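The INSERTION mutations above can be sketched the same way. This Python example assumes that AOP stands for an arithmetic comparison operator and uses a made-up set of type tags; the helper names are hypothetical. It checks that shifting both sides by a random X leaves the set of accepted integers unchanged.

```python
import operator
import random

# Comparison operators standing in for "AOP" (an assumption about its meaning).
AOPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
        ">=": operator.ge, "=": operator.eq}

# Hypothetical type tags; the real set of isX predicates is not listed here.
TYPES = ("Int", "Float", "Object")

def type_exclusions(x):
    """isX(A) -> the redundant isNotY(A) constraints, for all Y != X."""
    return [f"isNot{y}(A)" for y in TYPES if y != x]

def original(op, n):
    """Predicate for Var(A) OP n."""
    return lambda a: AOPS[op](a, n)

def shifted(op, n, x):
    """Predicate for Var(A) + x OP n + x."""
    return lambda a: AOPS[op](a + x, n + x)

def same_solutions(op, n, x, domain=range(-100, 101)):
    """Check that the shifted constraint accepts exactly the same integers."""
    before, after = original(op, n), shifted(op, n, x)
    return all(before(a) == after(a) for a in domain)

rng = random.Random(0)                 # seeded so the run is reproducible
x = rng.randint(-10**9, 10**9)         # finite stand-in for Random(-inf, +inf)
print(type_exclusions("Int"))          # ['isNotFloat(A)', 'isNotObject(A)']
print(all(same_solutions(op, 7, x) for op in AOPS))  # True
```

The equivalence holds here because Python integers are unbounded. With fixed-width or tagged integers, Var(A) + X can overflow, so the range of X would need to be limited for the mutation to stay semantics-preserving.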

The syntactic mutations that we added are:

// TODO

The semantic mutations added are:

// TODO

Design decisions

// TODO

Tests and observations

// TODO

Conclusions

// TODO

@FedeLoch FedeLoch changed the title [EXPERIMENTAL] SyntacticMutation model [EXPERIMENTAL] Syntactic and Semantic Mutations Sep 26, 2024
@FedeLoch FedeLoch requested a review from guillep September 26, 2024 14:35
@FedeLoch FedeLoch self-assigned this Sep 26, 2024
2 participants