Merge pull request #955 from soot-oss/doc/how2startATool
Improve Documentation website
stschott authored Jun 19, 2024
2 parents 1b7f9fd + fbe539b commit fde1381
Showing 41 changed files with 3,078 additions and 2,510 deletions.
10 changes: 3 additions & 7 deletions .github/workflows/gh-pages.yml
@@ -51,11 +51,7 @@ jobs:
python-version: 3.x

# install dependencies
- run: pip install mike
- run: pip install mkdocs-material
- run: pip install mkdocs-tooltips
- run: pip install git+https://github.com/RedisLabs/mkdocs-include.git
- run: pip install git+https://github.com/swissiety/LspLexer4Pygments.git
- run: pip install mike mkdocs-material mkdocs-tooltips git+https://github.com/RedisLabs/mkdocs-include.git git+https://github.com/swissiety/LspLexer4Pygments.git
# grab latest release url of the JimpleLSP jar and download it
- run: curl -s -L -o ./jimplelsp.jar $(curl -s https://api.github.com/repos/swissiety/jimpleLsp/releases/latest | grep 'browser_download_url".*jar"' | cut -d ':' -f 2,3 | tr -d \")

@@ -68,8 +64,8 @@ jobs:
git config --local user.email "github-actions[bot]@users.noreply.github.com"
git config --local user.name "github-actions[bot]"
# sanitive head_ref name
- run: echo "DOC_VERSION_NAME=$(echo ${{ github.head_ref }} | sed "s/[^[:alnum:]-]/_/g" )" >> $GITHUB_ENV
# sanitize head_ref name
- run: echo "DOC_VERSION_NAME=$(echo ${{ github.head_ref }} | sed "s/[^([[:alnum:]_.-]/_/g" )" >> $GITHUB_ENV

# on push to develop branch - keep a doc around for develop to show the current state
- name: deploy doc in subdirectory
2 changes: 1 addition & 1 deletion README.md
@@ -47,7 +47,7 @@ You want to collaborate? Please read our [coding guidelines and the contributors

## Publications
[The SootUp paper](https://doi.org/10.1007/978-3-031-57246-3_13) explains further details and the design decisions behind SootUp.
[Preprint](/docs/SootUp-paper.pdf) is also available.
[Preprint](/docs/assets/SootUp-paper.pdf) is also available.

If you use SootUp in your research work, feel free to cite it as follows:

26 changes: 26 additions & 0 deletions docs/analysisinputlocations.md
@@ -0,0 +1,26 @@
# Analysis Input
An `AnalysisInputLocation` specifies what should be analyzed: it points to code input that SootUp can analyze.
We ship multiple subclasses that handle different kinds of code input.

### Java Runtime
- Java <=8: `DefaultRTJarAnalysisInputLocation` points to the current rt.jar (alternatively, point to any rt.jar, as it is just a regular .jar file)
- Java >=9: `JRTFilesystemAnalysisInputLocation`

If you get errors about missing classes such as `java.lang.String` or `java.lang.Object`, you are most likely missing this AnalysisInputLocation.

### Java Bytecode .class, .jar, .war
- `JavaClassPathAnalysisInputLocation` - the equivalent of the classpath you would pass to the java executable, i.e. point to the root(s) of the package(s).

### Java Sourcecode .java
- `OTFCompileAnalysisInputLocation` - you can point directly to .java files or pass a String with Java source code; SootUp delegates to the `JavaCompiler` and transforms the bytecode from the compiler to Jimple.
- `JavaSourcePathInputLocation` [***experimental!***]{Has huge problems with exceptional flow!} - points to a directory that is the root source directory (containing the package directory structure).

### Jimple .jimple
- `JimpleAnalysisInputLocation` - needs a Path to a .jimple file or a directory.

### Android Bytecode .dex
- `ApkAnalysisInputLocation` - currently uses dex2jar internally. A SootUp solution that generates Jimple directly is WIP!


### Java cli arguments to configure SootUp
We created a [Utility](tool_setup.md) that parses a String of java command line arguments and configures SootUp accordingly.
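
As a rough illustration of the Java runtime rule above, the following sketch maps a Java version string to the name of the matching input location class. `RuntimeLocationChooser` and its methods are hypothetical helpers for this page and are not part of SootUp; only the two returned class names are taken from the list above.

```java
// Hypothetical helper (not part of SootUp): picks the name of the
// AnalysisInputLocation that provides the Java runtime library.
class RuntimeLocationChooser {

  /** Parses version strings such as "1.8", "9" or "17.0.2" into the major version. */
  static int majorVersion(String specVersion) {
    String[] parts = specVersion.split("\\.");
    int first = Integer.parseInt(parts[0]);
    // Versions up to Java 8 are reported in the form "1.x".
    return (first == 1 && parts.length > 1) ? Integer.parseInt(parts[1]) : first;
  }

  /** Java <= 8 ships an rt.jar, Java >= 9 uses the JRT filesystem. */
  static String runtimeLocationFor(String specVersion) {
    return majorVersion(specVersion) <= 8
        ? "DefaultRTJarAnalysisInputLocation"
        : "JRTFilesystemAnalysisInputLocation";
  }

  public static void main(String[] args) {
    String current = System.getProperty("java.specification.version");
    System.out.println(current + " -> " + runtimeLocationFor(current));
  }
}
```

The sketch only returns a class name, so it runs without SootUp on the classpath; in a real setup you would instantiate the corresponding class instead.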
14 changes: 7 additions & 7 deletions docs/announce.md → docs/announcement.md
@@ -12,17 +12,17 @@ SootUp is a library that can easily be included in other projects, leaving those

Below is an overview of what’s new.

* Library by default, framework as an option
* Modular Architecture, no more singletons
* New source code frontend
* Immutable Jimple IR
* Greatly increased testability and test coverage
- Library by default, framework as an option
- Modular Architecture, no more singletons
- New source code frontend
- Immutable Jimple IR
- Greatly increased testability and test coverage ![Coverage](https://camo.githubusercontent.com/adc4ab244f7c0c2b2f3fec0a6e5d778421ddc0be7f89a608c16533c9a964766f/68747470733a2f2f636f6465636f762e696f2f67682f736f6f742d6f73732f536f6f7455702f6272616e63682f646576656c6f702f67726170682f62616467652e7376673f746f6b656e3d454c4137553749415744)

SootUp is not a drop-in replacement for Soot! Due to its completely new architecture and API it is essentially an almost complete rewrite. For a while, Soot and SootUp will coexist, as many existing tools depend on Soot, yet our maintenance efforts will henceforth be focused on SootUp, not Soot, and on extending SootUp with those capabilities that people still find missing. For now, we recommend using SootUp for greenfield projects.

For more details, check out
* The SootUp home page: https://soot-oss.github.io/SootUp/, and
* The SootUp repository: https://github.com/soot-oss/SootUp/

[This Page ;-)](https://soot-oss.github.io/SootUp/) and the SootUp repository: [https://github.com/soot-oss/SootUp/](https://github.com/soot-oss/SootUp/)

We are very much looking forward to your feedback and feature requests. To this end, please create issues in the repository.

File renamed without changes.
File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes
29 changes: 14 additions & 15 deletions docs/bodyinterceptors.md
@@ -9,32 +9,31 @@ The "raw" generated Jimple from the Bytecodefrontend needs a lot improvements -
- The Locals we get from the Java bytecode are typically untyped. Therefore we have to augment the Local types which is done by the TypeAssigner.
- t.b.c.

Method scoped optimisations:
Optimizations (method scope)

- ConditionalBranchFolder: removes tautologic ifs that are always true/false - if we can determine it in the scope of the method.
- EmptySwitchEliminator: removes switches that are not really switching
- ConstantPropagatorAndFolder: calculates constant values before runtime
- CastAndReturnInliner: Removes merging flows to a single return
- UnreachableCodeEliminator: speaks for itself.
- TrapTightener

Make Local names standardized:
Standardize Jimple appearance

- LocalNameStandardizer: numbers Locals with the scheme: type-initial + number of type occurrence

!!! info "Soot Equivalent"

[BodyTransformer](https://github.com/soot-oss/soot/blob/develop/src/main/java/soot/BodyTransformer.java)


Below, we show how these BodyInterceptors work for the users who are interested in their internal workings.

### LocalSplitter

LocalSplitter is a <code>BodyInterceptor</code> that identifies uses of a local variable that are independent of each other (each belonging to its own definition) and separates them by renaming the local variable.


Example 1:

![LocalSplitter Example_1](./figures/LocalSplitter%20Example_1.png)
![LocalSplitter Example_1](assets/figures/LocalSplitter%20Example_1.png)

As shown in the example above, the local variable <code>l1</code> is defined twice. It can be split up into two new local variables, <code>l1#1</code> and <code>l1#2</code>, because both definitions are independent of each other.
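
The renaming idea can be sketched on straight-line code. This is a deliberately simplified illustration and not SootUp's implementation: it ignores branches, assumes every statement defines exactly one local, and the class `LocalSplitSketch` and its `Stmt` type are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified sketch: straight-line code only, no control flow.
class LocalSplitSketch {

  /** A statement of the form "def = f(uses...)". */
  static final class Stmt {
    final String def;
    final List<String> uses;

    Stmt(String def, String... uses) {
      this.def = def;
      this.uses = Arrays.asList(uses);
    }

    @Override
    public String toString() {
      return def + " = f(" + String.join(", ", uses) + ")";
    }
  }

  static List<Stmt> split(List<Stmt> body) {
    // Only locals that are defined more than once need to be split.
    Map<String, Integer> defCount = new HashMap<>();
    for (Stmt s : body) defCount.merge(s.def, 1, Integer::sum);

    Map<String, Integer> seen = new HashMap<>(); // definitions seen so far per local
    Map<String, String> current = new HashMap<>(); // original name -> current split name
    List<Stmt> result = new ArrayList<>();
    for (Stmt s : body) {
      // Rename uses first: they refer to the definition reaching this statement.
      String[] uses = new String[s.uses.size()];
      for (int i = 0; i < uses.length; i++) {
        String u = s.uses.get(i);
        uses[i] = current.getOrDefault(u, u);
      }
      String def = s.def;
      if (defCount.get(def) > 1) {
        def = s.def + "#" + seen.merge(s.def, 1, Integer::sum);
        current.put(s.def, def);
      }
      result.add(new Stmt(def, uses));
    }
    return result;
  }

  /** l1 is defined twice; each definition and its uses get their own name. */
  static List<Stmt> example() {
    return split(Arrays.asList(
        new Stmt("l1"), new Stmt("l2", "l1"), new Stmt("l1"), new Stmt("l3", "l1")));
  }
}
```

With branches, two definitions that reach the same use must keep a common name, which is why the real interceptor has to work on the control flow graph.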

@@ -45,7 +44,7 @@ Look for foldable navigation and tabs for showing old vs new

Example 2:

![LocalSplitter Example_2](./figures/LocalSplitter%20Example_2.png)
![LocalSplitter Example_2](assets/figures/LocalSplitter%20Example_2.png)

In the second example, the local variable <code>l2</code> is defined three times. However, it cannot be split up into three new local variables as in the first example, because its definitions in the if-branches are not independent of each other. Therefore, it can only be split up into two local variables as shown in the figure.

@@ -57,7 +56,7 @@ LocalPacker is a<code>BodyInterceptor</code>that attempts to minimize the number

Example:

![LocalPacker Example](./figures/LocalPacker%20Example.png)
![LocalPacker Example](assets/figures/LocalPacker%20Example.png)

In the example above, the local variables <code>l1</code> and <code>l3</code> are merged into one local variable <code>l1</code>, because they have the same type and do not interfere with each other. Likewise, the local variables <code>l2</code>, <code>l4</code> and <code>l5</code> are merged into another local variable <code>l2</code>. Although the local variable <code>l0</code> does not interfere with any other local variable, it cannot be merged with the others because its type differs.

@@ -70,7 +69,7 @@ TrapTightener is a<code>BodyInterceptor</code>that shrinks the protected area co

Example:

![TrapTightener Example](./figures/TrapTightener%20Example.png)
![TrapTightener Example](assets/figures/TrapTightener%20Example.png)

In the example above, we assume that only the <code>Stmt</code> <code>l2 := 2</code> might throw an exception caught by the <code>Trap</code> labeled <code>label3</code>. Before running the TrapTightener, the protected area covered by the Trap contains three <code>Stmts</code>: <code>l1 := 1; l2 := 2; l2 := 3</code>. After running the TrapTightener, the protected area is shrunk to the only <code>Stmt</code> that might throw an exception, namely <code>l2 := 2</code>.

@@ -82,7 +81,7 @@ EmptySwitchEliminator is a<code>BodyInterceptor</code>that removes empty switch

Example:

![EmptySwitchEliminator Example](./figures/EmptySwitchEliminator%20Example.png)
![EmptySwitchEliminator Example](assets/figures/EmptySwitchEliminator%20Example.png)

As shown in the example above, the switch statement in the jimple body always takes the default action. After running EmptySwitchEliminator, the switch statement is replaced with a <code>GotoStmt</code> to the default case.

@@ -94,7 +93,7 @@ UnreachableCodeEliminator is a<code>BodyInterceptor</code>that removes all unrea

Example:

![UnreachableCodeEliminator Example](./figures/UnreachableCodeEliminator%20Example.png)
![UnreachableCodeEliminator Example](assets/figures/UnreachableCodeEliminator%20Example.png)

The code segment <code>l2 = 2; l3 = 3;</code> is unreachable. It will be removed after running the UnreachableCodeEliminator.
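
Finding unreachable statements boils down to a reachability search over the control flow graph. The following is a minimal sketch of that idea, not SootUp's implementation; `ReachabilitySketch`, the string-based graph encoding and the example graph are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class ReachabilitySketch {

  /** Collects every node that can be reached from the entry node. */
  static Set<String> reachable(String entry, Map<String, List<String>> successors) {
    Set<String> visited = new HashSet<>();
    Deque<String> worklist = new ArrayDeque<>();
    worklist.push(entry);
    while (!worklist.isEmpty()) {
      String node = worklist.pop();
      if (!visited.add(node)) continue; // already processed
      for (String succ : successors.getOrDefault(node, Collections.<String>emptyList())) {
        worklist.push(succ);
      }
    }
    return visited;
  }

  /** Everything in the graph that is not reachable from the entry can be removed. */
  static Set<String> unreachable(String entry, Map<String, List<String>> successors) {
    Set<String> all = new TreeSet<>(successors.keySet());
    for (List<String> succs : successors.values()) all.addAll(succs);
    all.removeAll(reachable(entry, successors));
    return all;
  }

  /** Hypothetical body: no edge leads from the entry to "l2 = 2". */
  static Set<String> example() {
    Map<String, List<String>> cfg = new HashMap<>();
    cfg.put("l1 = 1", Arrays.asList("return"));
    cfg.put("l2 = 2", Arrays.asList("l3 = 3"));
    cfg.put("l3 = 3", Arrays.asList("return"));
    return unreachable("l1 = 1", cfg);
  }
}
```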

@@ -106,7 +105,7 @@ CopyPropagator is a<code>BodyInterceptor</code>that supports the global copy pro

Example for global copy propagation:

![UnreachableCodeEliminator Example](./figures/CopyPropagator%20Example_1.png)
![CopyPropagator Example_1](assets/figures/CopyPropagator%20Example_1.png)

Consider a code segment in the following form:

@@ -125,7 +124,7 @@ In the example for global copy propagation, the first used<code>l1</code>is repl

Example for constant propagation:

![CopyPropagator Example_1](figures/CopyPropagator%20Example_2.png)
![CopyPropagator Example_2](assets/figures/CopyPropagator%20Example_2.png)

Constant propagation is similar to copy propagation. Consider a code segment in the following form:

@@ -169,8 +168,8 @@ StaticSingleAssignmentFormer is a<code>BodyInterceptor</code>that transforms jim

Example:

![SSA Example_1](./figures/SSA%20Example_1.png)
![SSA Example_1](assets/figures/SSA%20Example_1.png)

![SSA Example_2](./figures/SSA%20Example_2.png)
![SSA Example_2](assets/figures/SSA%20Example_2.png)

In the given example, the StaticSingleAssignmentFormer lets each <code>IdentityStmt</code> and <code>AssignStmt</code> define a new local variable, and each use refers to the most recently defined local variable. In a join block, it is sometimes impossible to determine which definition is the most recent one. In this case, the StaticSingleAssignmentFormer inserts a <code>PhiStmt</code> at the front of the join block that merges all candidate definitions and assigns the result to a new local variable.
11 changes: 6 additions & 5 deletions docs/advanced-topics.md → docs/builtin-analyses.md
@@ -1,19 +1,20 @@
# Functionalities and Utilities
# BuiltIn Analyses
More to come!

#### LocalLivenessAnalyser
### LocalLivenessAnalyser

LocalLivenessAnalyser is used for querying for the list of live local variables before and after a given <code>Stmt</code>.

Example:

![LocalLiveness Example](./figures/LocalLiveness%20Example.png)
![LocalLiveness Example](assets/figures/LocalLiveness%20Example.png)

The live local variables before and after each <code>Stmt</code> are calculated when an instance of LocalLivenessAnalyser is created, as shown in the example above. They can be queried using the methods <code>getLiveLocalsBeforeStmt</code> and <code>getLiveLocalsAfterStmt</code>.
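
To make the notion of liveness concrete, here is a minimal backward pass over straight-line code. It is a sketch under simplifying assumptions (no branches, hypothetical `LivenessSketch` and `Stmt` types) and does not use the LocalLivenessAnalyser API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Simplified sketch: straight-line code only, no control flow.
class LivenessSketch {

  /** A statement that optionally defines one local and uses some locals. */
  static final class Stmt {
    final String def; // null if the statement defines nothing
    final List<String> uses;

    Stmt(String def, String... uses) {
      this.def = def;
      this.uses = Arrays.asList(uses);
    }
  }

  /** Returns, for each statement index, the locals that are live before it. */
  static List<Set<String>> liveBefore(List<Stmt> body) {
    List<Set<String>> result =
        new ArrayList<>(Collections.<Set<String>>nCopies(body.size(), null));
    Set<String> live = new TreeSet<>(); // nothing is live after the last statement
    for (int i = body.size() - 1; i >= 0; i--) {
      Stmt s = body.get(i);
      // liveBefore = (liveAfter - def) + uses
      if (s.def != null) live.remove(s.def);
      live.addAll(s.uses);
      result.set(i, new TreeSet<>(live));
    }
    return result;
  }

  /** a = 1; b = a + 1; return b */
  static List<Set<String>> example() {
    return liveBefore(Arrays.asList(
        new Stmt("a"), new Stmt("b", "a"), new Stmt(null, "b")));
  }
}
```

In straight-line code, the set live after a statement is simply the set live before its successor; with branches, the sets of all successors have to be merged and the pass repeated until nothing changes.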

#### DominanceFinder
### DominanceFinder

DominanceFinder is used for querying for the immediate dominator and dominance frontiers for a given basic block.

Example: ![DominanceFinder Example](figures/DominanceFinder%20Example.png)
Example: ![DominanceFinder Example](assets/figures/DominanceFinder%20Example.png)

After generating an instance of DominanceFinder for a <code>BlockGraph</code>, we get the immediate dominator and dominance frontiers for each basic block. Both properties can be queried using the methods <code>getImmediateDominator</code> and <code>getDominanceFrontiers</code>.
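
For readers unfamiliar with dominators, the sketch below computes dominator sets with the classic iterative fixpoint and derives the immediate dominator from them. It is not how DominanceFinder is implemented; `DominatorSketch`, the predecessor-map encoding and the diamond-shaped example graph are assumptions, and every block is assumed to be reachable from the entry.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class DominatorSketch {

  /** preds maps every non-entry block to its predecessor blocks. */
  static Map<String, Set<String>> dominators(String entry, Map<String, List<String>> preds) {
    Set<String> nodes = new TreeSet<>(preds.keySet());
    nodes.add(entry);
    Map<String, Set<String>> dom = new TreeMap<>();
    for (String n : nodes) dom.put(n, new TreeSet<>(nodes)); // start with "everything"
    dom.put(entry, new TreeSet<>(Collections.singleton(entry)));

    boolean changed = true;
    while (changed) {
      changed = false;
      for (String n : nodes) {
        if (n.equals(entry)) continue;
        // Dom(n) = {n} united with the intersection of Dom(p) over all predecessors p
        Set<String> updated = new TreeSet<>(nodes);
        for (String p : preds.get(n)) updated.retainAll(dom.get(p));
        updated.add(n);
        if (!updated.equals(dom.get(n))) {
          dom.put(n, updated);
          changed = true;
        }
      }
    }
    return dom;
  }

  /** The closest strict dominator is the one with the largest dominator set. */
  static String immediateDominator(String node, Map<String, Set<String>> dom) {
    String best = null;
    for (String d : dom.get(node)) {
      if (d.equals(node)) continue;
      if (best == null || dom.get(d).size() > dom.get(best).size()) best = d;
    }
    return best; // null for the entry block
  }

  /** Diamond: entry -> a, entry -> b, a -> exit, b -> exit. */
  static Map<String, Set<String>> example() {
    Map<String, List<String>> preds = new HashMap<>();
    preds.put("a", Arrays.asList("entry"));
    preds.put("b", Arrays.asList("entry"));
    preds.put("exit", Arrays.asList("a", "b"));
    return dominators("entry", preds);
  }
}
```

In the diamond, neither `a` nor `b` dominates `exit` because each can be bypassed through the other branch, so the immediate dominator of `exit` is `entry`.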
53 changes: 32 additions & 21 deletions docs/call-graph-construction.md → docs/callgraphs.md
@@ -10,8 +10,9 @@ Below, we show how to create a type hierarchy:
=== "SootUp"

```java
String cpString = "src/test/resources/Callgraph/binary";
List<AnalysisInputLocation> inputLocations = new ArrayList<>();
inputLocations.add(new JavaClassPathAnalysisInputLocation("src/test/resources/Callgraph/binary"));
inputLocations.add(new JavaClassPathAnalysisInputLocation(cpString));
inputLocations.add(new DefaultRTJarAnalysisInputLocation());

JavaView view = new JavaView(inputLocations);
@@ -43,23 +44,33 @@ Below, we show how to create a type hierarchy:
## Defining an Entry Method
All call graph construction algorithms require an entry method to start with. In a Java application, this is usually the main method. However, you can define arbitrary entry methods depending on your needs. Below, we show how to define such an entry method:

=== "SootUp"
=== "SootUp (performant)"

```java
JavaClassType classTypeA = view.getIdentifierFactory().getClassType("A");
JavaClassType classTypeA = view.getIdentifierFactory().getClassType("packageNameA.A");

MethodSignature entryMethodSignature =
view.getIdentifierFactory()
.getMethodSignature(
classTypeA,
JavaIdentifierFactory.getInstance()
.getMethodSubSignature(
"calc", VoidType.getInstance(), Collections.singletonList(classTypeA)));
"calc",
VoidType.getInstance(),
Collections.singletonList(classTypeA)
));
```


=== "SootUp (alternative)"

```java
String methodSigStr = "<packageNameA.A: void calc(packageNameA.A)>";
MethodSignature entryMethodSignature = view
.getIdentifierFactory().parseMethodSignature(methodSigStr);
```

=== "Soot"

```java
String targetTestClassName = "packageNameA.A";
SootMethod src = Scene.v().getSootClass(targetTestClassName).getMethodByName("doStuff");

```
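
The string form of a method signature follows the pattern `<DeclaringClass: ReturnType name(ParameterTypes)>`. The helper below assembles such a string from its parts; `SignatureStringSketch` is a hypothetical utility written for this page, not a SootUp class, and joining multiple parameter types with a plain comma is an assumption based on the Soot convention.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of SootUp): builds a signature string
// of the form <DeclaringClass: ReturnType name(ParamType1,ParamType2)>.
class SignatureStringSketch {

  static String methodSignature(
      String declaringClass, String returnType, String name, List<String> parameterTypes) {
    return "<" + declaringClass + ": " + returnType + " " + name
        + "(" + String.join(",", parameterTypes) + ")>";
  }

  /** The entry method used on this page: void calc(packageNameA.A) in packageNameA.A. */
  static String example() {
    return methodSignature("packageNameA.A", "void", "calc", Arrays.asList("packageNameA.A"));
  }
}
```

Note the closing `>`: a signature string without it is incomplete.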
@@ -71,13 +82,12 @@ You can construct a call graph with CHA as follows:
=== "SootUp"

```java
CallGraphAlgorithm cha =
new ClassHierarchyAnalysisAlgorithm(view);
CallGraphAlgorithm cha = new ClassHierarchyAnalysisAlgorithm(view);

CallGraph cg =
cha.initialize(Collections.singletonList(entryMethodSignature));

System.out.println(cg);
CallGraph cg = cha.initialize(Collections.singletonList(entryMethodSignature));

cg.callsFrom(entryMethodSignature).stream()
.forEach(tgt -> System.out.println(entryMethodSignature + " may call " + tgt));
```

=== "Soot"
@@ -90,7 +100,7 @@ You can construct a call graph with CHA as follows:
while (targets.hasNext()) {
SootMethod tgt = (SootMethod)targets.next();
System.out.println(src + " may call " + tgt);
}
}
```
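
To illustrate what Class Hierarchy Analysis does when it resolves a virtual call, here is a small stand-alone model. It is a sketch and not the `ClassHierarchyAnalysisAlgorithm` implementation: `ChaSketch`, the single-inheritance hierarchy and the example classes are all hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model: classes with single inheritance, methods identified by name only.
class ChaSketch {
  final Map<String, String> superOf = new HashMap<>(); // class -> direct superclass
  final Map<String, Set<String>> declared = new HashMap<>(); // class -> declared methods

  void addClass(String name, String superName, String... methods) {
    superOf.put(name, superName);
    declared.put(name, new HashSet<>(Arrays.asList(methods)));
  }

  boolean isSubtypeOf(String type, String ancestor) {
    for (String t = type; t != null; t = superOf.get(t)) {
      if (t.equals(ancestor)) return true;
    }
    return false;
  }

  /** Walks up the hierarchy to the implementation that 'type' declares or inherits. */
  String dispatch(String type, String method) {
    for (String t = type; t != null; t = superOf.get(t)) {
      if (declared.getOrDefault(t, Collections.<String>emptySet()).contains(method)) {
        return t + "." + method;
      }
    }
    return null;
  }

  /** CHA: any subtype of the declared receiver type may be the runtime type. */
  Set<String> resolveCall(String declaredType, String method) {
    Set<String> targets = new TreeSet<>();
    for (String candidate : superOf.keySet()) {
      if (!isSubtypeOf(candidate, declaredType)) continue;
      String target = dispatch(candidate, method);
      if (target != null) targets.add(target);
    }
    return targets;
  }

  static Set<String> example() {
    ChaSketch hierarchy = new ChaSketch();
    hierarchy.addClass("Shape", null, "area");
    hierarchy.addClass("Circle", "Shape", "area");
    hierarchy.addClass("Square", "Shape", "area");
    hierarchy.addClass("UnitSquare", "Square"); // inherits Square.area
    return hierarchy.resolveCall("Shape", "area");
  }
}
```

Because CHA only looks at the declared hierarchy, a call on `Shape` yields every implementation below `Shape`, whether or not the corresponding class is ever instantiated.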

## Rapid Type Analysis
@@ -100,13 +110,12 @@ You can construct a call graph with RTA as follows:
=== "SootUp"

```java
CallGraphAlgorithm rta =
new RapidTypeAnalysisAlgorithm(view);
CallGraphAlgorithm rta = new RapidTypeAnalysisAlgorithm(view);

CallGraph cg =
rta.initialize(Collections.singletonList(entryMethodSignature));
CallGraph cg = rta.initialize(Collections.singletonList(entryMethodSignature));

System.out.println(cg);
cg.callsFrom(entryMethodSignature).stream()
.forEach(tgt -> System.out.println(entryMethodSignature + " may call " + tgt));
```

=== "Soot"
@@ -128,12 +137,14 @@ You can construct a call graph with RTA as follows:
```
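
Rapid Type Analysis narrows the set of call targets by considering only classes that are actually instantiated. The following stand-alone sketch shows that filtering step; it is not the `RapidTypeAnalysisAlgorithm` implementation. `RtaSketch` and the example hierarchy are hypothetical, and the set of instantiated classes is passed in directly, whereas a real RTA collects it from the reachable methods.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model: single inheritance, methods identified by name only.
class RtaSketch {

  static Set<String> resolveCall(
      Map<String, String> superOf,
      Map<String, Set<String>> declared,
      Set<String> instantiated,
      String declaredType,
      String method) {
    Set<String> targets = new TreeSet<>();
    // RTA: only instantiated classes can be the runtime type of the receiver.
    for (String candidate : instantiated) {
      boolean subtype = false;
      for (String t = candidate; t != null; t = superOf.get(t)) {
        if (t.equals(declaredType)) subtype = true;
      }
      if (!subtype) continue;
      // Walk up to the implementation the candidate declares or inherits.
      for (String t = candidate; t != null; t = superOf.get(t)) {
        if (declared.getOrDefault(t, Collections.<String>emptySet()).contains(method)) {
          targets.add(t + "." + method);
          break;
        }
      }
    }
    return targets;
  }

  /** Shape has subclasses Circle and Square, but only Circle is instantiated. */
  static Set<String> example() {
    Map<String, String> superOf = new HashMap<>();
    superOf.put("Circle", "Shape");
    superOf.put("Square", "Shape");
    Map<String, Set<String>> declared = new HashMap<>();
    for (String c : Arrays.asList("Shape", "Circle", "Square")) {
      declared.put(c, new HashSet<>(Collections.singleton("area")));
    }
    Set<String> instantiated = Collections.singleton("Circle");
    return resolveCall(superOf, declared, instantiated, "Shape", "area");
  }
}
```

For the same hierarchy, a purely hierarchy-based resolution would also report `Shape.area` and `Square.area`; restricting the candidates to instantiated classes leaves only `Circle.area`.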

## Variable Type Analysis
(**WIP!**)

The Variable Type Analysis (VTA) algorithm further refines the call graph that RTA constructs. When resolving a method call on an interface, it considers only those instantiations of the interface's implementers that are actually assigned.
When considering assignments, we usually need to consider **pointer** (points-to) relationships.

!!! info
!!! info "WIP"

VTA algorithm was implemented using the [Spark](https://plg.uwaterloo.ca/~olhotak/pubs/thesis-olhotak-msc.pdf) pointer analysis framework.
VTA algorithm will be implemented using the [Spark](https://plg.uwaterloo.ca/~olhotak/pubs/thesis-olhotak-msc.pdf) pointer analysis framework.
A reimplementation of Spark in SootUp is currently under development.

Spark requires an initial call graph to begin with. You can use one of the call graphs that we have constructed above. You can construct a call graph with VTA as follows: