Backend Rust/Foreign Language Support #69
base: master
this is due to linking issues with rust
instead of a dynamic library
Hi,
Thanks.
Thanks for taking a look :)
This will be difficult, as the 'rust backend' code added in this repository delegates the solving, etc. to a backend that is implemented in a foreign language (and for which no 'canonical' version exists at this point). I presume most tests are irrelevant in this case. Note that the 'rust backend' is also not usable on its own, because it compiles to a static archive instead of a shared library, so it is not directly compatible with the interface that SymCC (and SymQEMU) expect. The symcc_runtime crate in the LibAFL repo takes care of giving the user an actual interface for implementing the runtime and enables the production of a shared library that can be used as a SymCC backend. (This is achieved by linking against the static archive that is generated by building the rust backend in this repo, re-exporting the standard SymCC interface from it, and implementing the foreign symbols.) Finally, there is a smoke test for this whole process in the LibAFL repo.
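To make the "implementing the foreign symbols" part more concrete, here is a minimal sketch of what one such symbol could look like on the Rust side. The function name, its signature, and the handle type are hypothetical and are not taken from SymCC or symcc_runtime; the counter only stands in for a real expression builder.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical stand-in for SymCC's pointer-sized expression handle.
pub type ExprHandle = usize;

/// Counter used to hand out fresh, non-zero handles (illustration only).
static NEXT_HANDLE: AtomicUsize = AtomicUsize::new(1);

/// A hypothetical "foreign symbol": the static archive would declare a
/// function like this and rely on the final shared object to define it.
/// In a real `cdylib` crate the function would additionally be exported
/// under an unmangled name so that the archive can resolve it at link time.
pub extern "C" fn example_build_integer(_value: u64, _bits: u8) -> ExprHandle {
    // A real backend would construct an expression here; this sketch only
    // returns a fresh opaque handle.
    NEXT_HANDLE.fetch_add(1, Ordering::Relaxed)
}
```

The design point is only that the Rust crate owns the final link step: it defines the symbols the archive leaves open and produces the shared object that SymCC loads.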
I'm writing up some more extensive documentation for LibAFL users in the coming days (this is basically the last missing piece for this whole project :) )
Exactly. SymQEMU is supported as well, because they use the same interface. Garbage collection, which, AFAIU, is required for the massive amounts of expressions that the SymQEMU frontend produces, is implemented as part of this PR.
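As a rough illustration of the bookkeeping that garbage collection implies, the sketch below tracks every handle passed to client code and reports the ones that are no longer reachable. The type and method names are hypothetical, and how the reachable set is computed is assumed to happen elsewhere; this is not the code from the PR.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for SymCC's pointer-sized expression handle.
pub type ExprHandle = usize;

/// Tracks every expression that was ever handed to client code.
#[derive(Default)]
pub struct ExpressionTracker {
    allocated: HashSet<ExprHandle>,
}

impl ExpressionTracker {
    /// Remember a handle that was just passed to client code.
    pub fn register(&mut self, expr: ExprHandle) {
        self.allocated.insert(expr);
    }

    /// Remove every tracked handle that is not in `reachable` and return
    /// the removed handles in ascending order, so that the wrapped backend
    /// can be told to release them.
    pub fn collect_garbage(&mut self, reachable: &HashSet<ExprHandle>) -> Vec<ExprHandle> {
        let mut dead: Vec<ExprHandle> =
            self.allocated.difference(reachable).copied().collect();
        dead.sort_unstable();
        self.allocated.retain(|expr| reachable.contains(expr));
        dead
    }

    /// Number of handles that are still tracked.
    pub fn live_count(&self) -> usize {
        self.allocated.len()
    }
}
```

For example, after registering handles 1 to 4 and collecting with a reachable set of {2, 4}, the tracker would report 1 and 3 as unreachable and keep two live handles.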
Yes. Here is the runtime used in the aforementioned smoke test, which traces the calls made to the backend for processing outside of the traced target, and here is a backend that is used as part of an example hybrid fuzzer based on LibAFL. Note that the runtimes I linked to make use of the pre-built components that come with the
Cheers
That's a very interesting project! (Apologies for the late reaction...) I've added a few suggestions inline.
Just out of curiosity: Why did you decide to do "target -> SymCC runtime -> Rust code" rather than "target -> Rust code -> parts of SymCC runtime used as helpers"? I'm not suggesting that one is better than the other, just curious :)
In principle, I have no objections against a backend that delegates to foreign-language implementations. But I agree with @aurelf that it would be nice to have a way to make sure we don't break implementations, other than receiving bug reports after the fact :P Do you think it's feasible to express the simple backend in terms of your generic backend as I suggested in the comments? That would give us a nice way to make sure we don't break anything...
@@ -35,7 +36,9 @@ set(SHARED_RUNTIME_SOURCES
${CMAKE_CURRENT_SOURCE_DIR}/Shadow.cpp
${CMAKE_CURRENT_SOURCE_DIR}/GarbageCollection.cpp)

if (${QSYM_BACKEND})
if (${RUST_BACKEND})
Minor: I think it would make sense to rename the backend in a way that prevents confusion with a possible future backend for analyzing Rust programs (as opposed to your approach of enabling a backend implemented in Rust).
#endif

/// The set of all expressions we have ever passed to client code.
std::set<SymExpr> allocatedExpressions;
It would be very nice if we could refactor the parts that are the same in the simple backend and your new backend into a shared header or something of the sort, in order to avoid code duplication. We could even go as far as expressing the simple backend as an implementation of your interface - which would nicely solve @aurelf's demand for a testable version of the new backend.
Just a quick note: I haven't forgotten about this, just very busy at this time. I'll come back to this as soon as possible!
That's quite a curious design decision indeed. It follows from the following requirements:
It is possible to link the Rust parts of the backend into the final shared object using (Corrosion).

If we now want to invert the dependency between CMake and Cargo (i.e. call CMake from Cargo), then we can't have Cargo call into CMake to do the entire build. Instead, we need to build a static library out of the code that we want from the SymCC runtime and link that into the final shared object, which is now built and linked by Cargo. This is where requirement 2. comes in. Since we can't simply re-export the symbols from the SymCC/C++ part of the backend, we need to define the required symbols in our Rust crate. Therefore, in the

I'm kind of annoyed that I can't seem to find a more concise explanation, but the issues are kind of intertwined in a weird way. Hope it is understandable. Also, another goal was to modify as little existing SymCC code as possible, so as to be able to easily maintain a fork in case these changes are not merged.
The way that the C++ part of the backend (specifically the implementation of _sym_bits_helper) is implemented at the moment means that 32-bit targets can't be supported. (In short: the bit-width of all expressions is stored in the least significant byte of the "SymExpr" (i.e. pointer-sized) type. On 32-bit, I believe this leaves too few bits for the actual address.)

To come back to the issue of testability: In general, the code in this PR doesn't implement a backend, as discussed before. What it does implement, however, is this whole forwarding business. Therefore my suggestion would be to test only the forwarding, decoupled from the actual backend logic. For example, a 'tracing' backend could be implemented that simply outputs the calls that were made to it in text format to stdout. A test script could then ensure that the correct sequence of calls was made to the backend. In fact, this is basically what the tests inside the LibAFL repo do at this point: https://github.com/AFLplusplus/LibAFL/tree/main/libafl_concolic/test .

Of course, in my humble opinion, the current C++ parts of the backend (not the LLVM pass of course) should simply be implemented in Rust, providing a FFI compatible interface like the one in the
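A 'tracing' backend of the kind suggested above could be sketched roughly as follows. The trait and its three methods are a hypothetical, heavily reduced interface made up for illustration; they are not the interface from this PR or from the LibAFL tests.

```rust
/// Hypothetical reduced backend interface (illustration only).
pub trait Backend {
    fn build_integer(&mut self, value: u64, bits: u8) -> usize;
    fn build_add(&mut self, a: usize, b: usize) -> usize;
    fn push_path_constraint(&mut self, constraint: usize, taken: bool);
}

/// A backend that does no solving: it prints every call to stdout and
/// keeps the same lines in memory so a test can compare the sequence.
#[derive(Default)]
pub struct TracingBackend {
    pub log: Vec<String>,
    next_id: usize,
}

impl TracingBackend {
    /// Print and store one trace line, then return a fresh expression id.
    fn record(&mut self, line: String) -> usize {
        println!("{}", line);
        self.log.push(line);
        self.next_id += 1;
        self.next_id
    }
}

impl Backend for TracingBackend {
    fn build_integer(&mut self, value: u64, bits: u8) -> usize {
        self.record(format!("build_integer({}, {})", value, bits))
    }

    fn build_add(&mut self, a: usize, b: usize) -> usize {
        self.record(format!("build_add({}, {})", a, b))
    }

    fn push_path_constraint(&mut self, constraint: usize, taken: bool) {
        // The id returned by `record` is not needed for constraints.
        self.record(format!("push_path_constraint({}, {})", constraint, taken));
    }
}
```

A test script would then only need to diff the recorded lines against an expected trace, which exercises the forwarding without depending on any solver.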
…symcc fuzzing helper (#1)
* This is a temporary fix due to std::iterator deprecation.
  This commit needs to be reverted once a proper fix is in place.
* symcc_fuzzing_helper: Move to clap3 (eurecom-s3#94)
* Revert "symcc_fuzzing_helper: Move to clap3 (eurecom-s3#94)" (eurecom-s3#101)
  This reverts commit 88b464c.
* Add some FAQs to the Readme
* changed from structopt to clap 3 (eurecom-s3#103)
* fix for issue eurecom-s3#108
* fix for issue eurecom-s3#108
* LLVM 12 works without changes
* Add a clang-format configuration
  This is just the output of "clang-format -style=llvm -dump-config".
* Add support for LLVM 13
  Clang now uses the new pass manager for the optimization pipeline, so we have to do the same to make Clang use our pass. Moreover, FileCheck now complains if a configured prefix doesn't appear in the checked file; added "ANY" in three tests where it was missing. Finally, printing arbitrary-precision integers in QSYM needed some changes.
* Add support for LLVM 14
* LLVM 15 works without changes
* fix issue eurecom-s3#109
* Run clang-format
  We should really automate this...
* Add a GitHub action that checks LLVM compatibility
* Prevent test failures in case of reordered solver output
  Z3 doesn't always output model constants in the same order; make sure that our tests don't depend on it.
* Accept symbolic input from memory
  This commit adds the option to mark symbolic input by calling symcc_make_symbolic from the program under test. The refactoring that was required to add the new feature has had the pleasant side effect that the QSYM backend now doesn't require the entire input upfront anymore, making it much more convenient to feed symbolic data through stdin.
* Run GitHub actions for pull requests only
  No need for "push": the "pull_request" event already triggers when new commits are pushed to the PR branch, and we expect all changes to go through a PR.
Co-authored-by: Aurelien Francillon <[email protected]>
Co-authored-by: Dominik Maier <[email protected]>
Co-authored-by: aurelf <[email protected]>
Co-authored-by: Dominik Maier <[email protected]>
Co-authored-by: Emilio Coppa <[email protected]>
Co-authored-by: Sebastian Poeplau <[email protected]>
Merge Upstream
Adding more functions
Remove extern block from RustRuntime.h
Change to C
* push * add * FMT * f * bits
Hey,
I came up with the code in this PR in order to be able to implement new SymCC/SymQEMU backends in Rust for my GSoC project (which is about integrating LibAFL with SymCC as a concolic engine for fuzzing).
In a nutshell, it implements a new backend in SymCC which is supposed to wrap a backend implemented in another language. This wrapper implements some utility functionality and garbage collection. How this is achieved is described in the README in runtime/rust_backend/README. The wrapped runtime in turn can implement a reduced interface of the 'original' runtime and can also keep the libc wrappers.
I have put quite a bit of effort into making this code strictly additive to the current codebase of SymCC, which should make it easy to maintain. Also note that while it may look like a lot of code, much of it is boilerplate and more or less a copy of the SimpleRuntime. I propose to merge this into SymCC purely for organisational reasons (i.e. not having to sync the fork with SymCC upstream).
I think the code could benefit SymCC/SymQEMU and related research by making it much easier to build a runtime that is not written in C++, but still keeps the libc wrappers and garbage collection feature from the common SymCC runtime code. Note that interfacing with the GC feature of SymCC more or less requires interoperability with C++, making an implementation in foreign languages quite cumbersome. The common backend code also makes the assumption that the backend will always keep track of the bit-width of any expression, which the proposed code also implements transparently for the wrapped runtime.
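One way to keep track of the bit-width without a side table, and the layout this thread describes for the C++ part, is to pack the width into the least significant byte of the pointer-sized handle. The sketch below assumes exactly that layout; the function names and the use of a plain id instead of a real address are hypothetical.

```rust
/// Hypothetical stand-in for SymCC's pointer-sized expression handle.
pub type ExprHandle = usize;

/// Number of low bits reserved for the bit-width (one byte, as assumed above).
const WIDTH_BITS: u32 = 8;
const WIDTH_MASK: usize = 0xff;

/// Pack an expression id and its bit-width into one handle.
/// Returns `None` if the id does not fit into the remaining bits; on a
/// 32-bit target only 24 bits would remain, which illustrates the
/// limitation mentioned in the discussion.
pub fn pack(id: usize, bits: u8) -> Option<ExprHandle> {
    if id > (usize::MAX >> WIDTH_BITS) {
        return None;
    }
    Some((id << WIDTH_BITS) | bits as usize)
}

/// Recover the bit-width from the least significant byte.
pub fn bits_of(handle: ExprHandle) -> u8 {
    (handle & WIDTH_MASK) as u8
}

/// Recover the id that the wrapped backend originally returned.
pub fn id_of(handle: ExprHandle) -> usize {
    handle >> WIDTH_BITS
}
```

With this scheme the wrapper can answer bit-width queries itself and only forward the unpacked id to the wrapped runtime.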
Cheers