Fix Docker build, generate patches for plausible invariants (#1)

* Doc: move old instructions to another place * Generate patch for plausible but overfitting invariants as well * Dockerfile: fix z3 install error * Try to bump up python version * Fix Dockerfile * Fix minor error * Print more info and update README
yuntongzhang · Dec 21, 2023 · e99a22e · e99a22e
1 parent fc0ce8a
commit e99a22e
Show file tree

Hide file tree

Showing 9 changed files with 269 additions and 169 deletions.
diff --git a/Dockerfile b/Dockerfile
@@ -1,11 +1,7 @@
 FROM ubuntu:18.04
 
-RUN apt clean
-RUN apt update
+RUN apt clean && apt update
 RUN DEBIAN_FRONTEND=noninteractive apt install -y build-essential curl wget software-properties-common llvm
-# add this for installing latest version of python3.8
-RUN add-apt-repository ppa:deadsnakes/ppa
-RUN apt update
 
 # install elfutils
 RUN DEBIAN_FRONTEND=noninteractive apt install -y unzip pkg-config zlib1g zlib1g-dev autoconf libtool cmake
@@ -25,17 +21,26 @@ RUN DEBIAN_FRONTEND=noninteractive apt install -y git vim python3-pip gdb \
 
 RUN DEBIAN_FRONTEND=noninteractive apt install -y clang-10
 
-# install python3.8 and the libraries we need
+# install a newer version of cmake, since it is required by z3
+RUN DEBIAN_FRONTEND=noninteractive apt-get install --yes --no-install-recommends wget
+RUN wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | gpg --dearmor - | tee /etc/apt/trusted.gpg.d/kitware.gpg >/dev/null
+RUN DEBIAN_FRONTEND=noninteractive apt purge --yes --auto-remove cmake && \
+    apt-add-repository "deb https://apt.kitware.com/ubuntu/ $(lsb_release -cs) main"  && \
+    apt update && \
+    apt-get install --yes --no-install-recommends cmake
+
+# install python3.8, for driver scripts of the project
 RUN DEBIAN_FRONTEND=noninteractive apt install -y python3.8
-RUN python3.8 -m pip install toml pyparsing z3-solver libclang
-RUN python3 -m pip install toml pyparsing
 
 # build the project
 COPY . /home/yuntong/vulnfix/
 WORKDIR /home/yuntong/vulnfix/
 RUN git submodule init
 RUN git submodule update
-# build is slow within docker build, so just build inside container
+RUN python3.8 -m pip install -r requirements.txt
+# required for building cvc5 (default python3 is 3.6)
+RUN python3 -m pip install toml pyparsing
+# NOTE: this might be slow
 RUN ./build.sh
 
 ENV PATH="/home/yuntong/vulnfix/bin:${PATH}"

diff --git a/README.md b/README.md
@@ -17,6 +17,11 @@ the vulnerable states, which can be used to generate a patch later on.
 
 ## Getting started
 
+_New changes has been added to VulnFix since the ISSTA22 publication. To get the version during
+ISSTA22 period and steps for using that version, please refer to [ISSTA22.md](doc/ISSTA22.md)._
+
+> TODO: Add getting started instruction for the new tool version.
+
 Firstly, certain OS configurations are required to be set for VulnFix and its dependencies (e.g. AFL).
 To set these, run:
 
@@ -30,35 +35,27 @@ echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
 
 The VulnFix tool and its dependencies are available in docker container. (Please refer to
 [doc/INSTALL.md](doc/INSTALL.md) for instructions on building it from source.)
+
 To start:
 
 ```bash
-docker pull yuntongzhang/vulnfix:issta22
-docker run -it --memory=30g --name vulnfix-issta22 yuntongzhang/vulnfix:issta22
+docker pull yuntongzhang/vulnfix:latest-manual
+docker run -it --memory=30g --name vulnfix yuntongzhang/vulnfix:latest-manual
 ```
 
-Once inside the container, navigate to the VulnFix directory and invoke it on CVE-2012-5134:
+Once inside the container, invoke it on one example (e.g. CVE-2012-5134) with:
 
 ```bash
+# clone and build the target project
+cd /home/yuntong/vulnfix/data/libxml2/cve_2012_5134
+./setup.sh
+# run vulnfix to repair
 cd /home/yuntong/vulnfix
-python3.8 src/main.py data/libxml2/cve_2012_5134/config
+vulnfix data/libxml2/cve_2012_5134/config
 ```
 
-AFL should be started after a shorting period of time of parsing the config file and setting up the
-runtime directory. The snapshot fuzzing stage will follow. The total time taken for this command
-is roughly 12-15 minutes, and the final few lines printed on screen should be something like this:
-
-```
-2022-05-24 05:40:33 --- Final patch invariants - #(1) : ['len >= 1'] ---
-
-2022-05-24 05:40:33 Generating patch from the patch invariant `len >= 1` ...
-2022-05-24 05:40:41 Patch generation successful! Please find the patch at: /home/yuntong/vulnfix/data/libxml2/cve_2012_5134/runtime/vulnfix.patch.
-```
-
-This indicates a successful run of VulnFix, with a single patch invariant `len >= 1` produced in the
-end. A patch file is also generated based on this invariant, at the location:
-`/home/yuntong/vulnfix/data/libxml2/cve_2012_5134/runtime/vulnfix.patch`.
-
+After VulnFix finishes, the results (generated invariants and patches) can be found in
+`/home/yuntong/vulnfix/data/libxml2/cve_2012_5134/runtime/result/`.
 
 ## Documentation
 

diff --git a/doc/ISSTA22.md b/doc/ISSTA22.md
@@ -0,0 +1,47 @@
+# Getting started steps (for the version during ISSTA 22)
+
+_This is the instruction for running VulnFix on one example. The full steps for ISSTA22 artifact
+evaluation is at [doc/AE.md](doc/AE.md)
+
+Firstly, certain OS configurations are required to be set for VulnFix and its dependencies (e.g. AFL).
+To set these, run:
+
+```bash
+echo core | sudo tee /proc/sys/kernel/core_pattern
+cd /sys/devices/system/cpu
+echo performance | sudo tee cpu*/cpufreq/scaling_governor
+
+echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
+```
+
+The VulnFix tool and its dependencies are available in docker container. (Please refer to
+[doc/INSTALL.md](doc/INSTALL.md) for instructions on building it from source.)
+
+To start:
+
+```bash
+docker pull yuntongzhang/vulnfix:issta22
+docker run -it --memory=30g --name vulnfix-issta22 yuntongzhang/vulnfix:issta22
+```
+
+Once inside the container, navigate to the VulnFix directory and invoke it on CVE-2012-5134:
+
+```bash
+cd /home/yuntong/vulnfix
+python3.8 src/main.py data/libxml2/cve_2012_5134/config
+```
+
+AFL should be started after a shorting period of time of parsing the config file and setting up the
+runtime directory. The snapshot fuzzing stage will follow. The total time taken for this command
+is roughly 12-15 minutes, and the final few lines printed on screen should be something like this:
+
+```
+2022-05-24 05:40:33 --- Final patch invariants - #(1) : ['len >= 1'] ---
+
+2022-05-24 05:40:33 Generating patch from the patch invariant `len >= 1` ...
+2022-05-24 05:40:41 Patch generation successful! Please find the patch at: /home/yuntong/vulnfix/data/libxml2/cve_2012_5134/runtime/vulnfix.patch.
+```
+
+This indicates a successful run of VulnFix, with a single patch invariant `len >= 1` produced in the
+end. A patch file is also generated based on this invariant, at the location:
+`/home/yuntong/vulnfix/data/libxml2/cve_2012_5134/runtime/vulnfix.patch`.
diff --git a/requirements.txt b/requirements.txt
@@ -0,0 +1,4 @@
+libclang==14.0.1
+pyparsing==3.0.8
+toml==0.10.2
+z3-solver==4.8.17.0
diff --git a/src/backend.py b/src/backend.py
@@ -1,7 +1,7 @@
 from abc import ABC, abstractmethod
 import subprocess
 import re
-from sys import stdout
+from typing import List
 
 from logger import logger
 from utils import *
@@ -32,7 +32,7 @@ class DaikonBackend(BackendBase):
     def __init__(self):
         super().__init__()
 
-    def run(self):
+    def run(self) -> List[str]:
         """
         :returns: A list of invariants, a list of variables appeared in invariants.
                 If there is no output, returns two empty lists.
@@ -66,7 +66,7 @@ def run(self):
 
         return invariants
 
-    def __filter_daikon_invariants(self, invs):
+    def __filter_daikon_invariants(self, invs: List[str]) -> List[str]:
         """
         Some daikon invariants are complicated to turn off from Daikon configs.
         We filter them out here.
@@ -82,7 +82,7 @@ def __filter_daikon_invariants(self, invs):
         return filtered_invs
 
 
-    def __sanitize_daikon_invariants(self, invs):
+    def __sanitize_daikon_invariants(self, invs: List[str]) -> List[str]:
         """
         Daikon output is formatted in java. Here we sanitize them to format that
         can be handled by z3 in python, and also can be use to generat patch in C.
@@ -122,7 +122,7 @@ def __sanitize_daikon_invariants(self, invs):
 
         return sanitized_invs
 
-    def __remove_duplicated_invariants(self, invs):
+    def __remove_duplicated_invariants(self, invs: List[str]) -> List[str]:
         """
         Daikon can produce semantically equivalent invariants.
         This method detects the duplicates and only keeps one of them.
@@ -184,7 +184,7 @@ def generate_input_from_snapshots(self):
             f.write(fail_res)
 
 
-    def __convert_vars_into_decls(self, vars):
+    def __convert_vars_into_decls(self, vars: List[str]) -> str:
         res = "\n\nppt ..fix_location():::ENTER\n"
         res += "\n\nppt ..fix_location():::EXIT\n"
         res += "  ppt-type point\n"
@@ -263,7 +263,7 @@ def __init__(self):
         pass
 
 
-    def run(self):
+    def run(self) -> List[str]:
         logger.info('Running cvc5 for inference. This make take a while ...')
         cmd = [values.full_cvc5, "--sygus-arg-relevant", "--sygus-eval-opt",
             "--sygus-grammar-norm", "--sygus-min-grammar",
@@ -289,7 +289,7 @@ def run(self):
         return [inv]
 
 
-    def __sanitize_cvc5_invariant(self, invariant):
+    def __sanitize_cvc5_invariant(self, invariant: str) -> str:
         inv_tokens = invariant.strip().split()
         # change = to ==
         inv_tokens = [ '==' if t == '=' else t for t in inv_tokens ]

diff --git a/src/ce_refiner.py b/src/ce_refiner.py
@@ -3,17 +3,20 @@
 from ce_single_var import *
 from subroutines import *
 from snapshot import *
+from backend import BackendBase
+from typing import List
 
 EARLY_TERM_THRESHOLD = 5
 
 class CeRefiner(object):
-    def __init__(self, exprs, inputs_pass, inputs_fail, backend):
+    def __init__(self, exprs: List[str], inputs_pass, inputs_fail, backend):
         """
         :param exprs: list of candidate expressions (constraints)
         :param inputs_pass: list of passing test inputs
         :param inputs_fail: list of failing test inputs
         """
         self.round = 0
+        # always keep the current candidate invs from the current round
         self.candidate_exprs = exprs
         self.consecutive_same_count = 0
         # all the inputs given
@@ -22,7 +25,7 @@ def __init__(self, exprs, inputs_pass, inputs_fail, backend):
         # record which inputs have not been used
         self.untouched_inputs_pass = set(inputs_pass)
         self.untouched_inputs_fail = set(inputs_fail)
-        self.backend = backend
+        self.backend: BackendBase = backend
         self.__refresh_driver_tests()
 
     def __refresh_driver_tests(self):
@@ -91,7 +94,7 @@ def one_step_refinement(self, max_iter=6):
                 f' The most recent patch invariants are: {[e for e in self.candidate_exprs]}.\n')
             return candidate_exprs
 
-        # update refiner attributes
+        # update refiner attributes (set our own internal state)
         self.round += 1
         if candidate_exprs == self.candidate_exprs:
             self.consecutive_same_count += 1