Added docs for symbol resolver
oxisto committed Sep 30, 2024
1 parent b87a85a commit 34fd4b8
Showing 5 changed files with 74 additions and 10 deletions.
@@ -171,7 +171,7 @@ open class SymbolResolver(ctx: TranslationContext) : ComponentPass(ctx) {
// identifier in Go)
if (
language is HasAnonymousIdentifier &&
- current.name.localName == language.anonymousIdentifier
+ current.name.localName == language.anonymousIdentifier
) {
return
}
@@ -180,6 +180,11 @@ open class SymbolResolver(ctx: TranslationContext) : ComponentPass(ctx) {
// resolution, but in future this will also be used in resolving regular references.
current.candidates = scopeManager.findSymbols(current.name, current.location).toSet()

// Preparation for a future without legacy call resolving. Taking the first candidate is not
// ideal since we are running into an issue with function pointers here (see workaround
// below).
var wouldResolveTo = current.candidates.singleOrNull()

// For now, we need to ignore reference expressions that are directly embedded into call
// expressions, because they are the "callee" property. In the future, we will use this
// property to actually resolve the function call. However, there is a special case that
@@ -189,21 +194,38 @@ open class SymbolResolver(ctx: TranslationContext) : ComponentPass(ctx) {
// of this call expression back to its original variable declaration. In the future, we want
// to extend this particular code to resolve all callee references to their declarations,
// i.e., their function definitions and get rid of the separate CallResolver.
- var wouldResolveTo: Declaration? = null
if (current.resolutionHelper is CallExpression) {
// Peek into the declaration, and if it is only one declaration and a variable, we can
// proceed normally, as we are running into the special case explained above. Otherwise,
// we abort here (for now).
- wouldResolveTo = current.candidates.singleOrNull()
if (wouldResolveTo !is VariableDeclaration && wouldResolveTo !is ParameterDeclaration) {
return
}
}

// Some stupid C++ workaround to use the legacy call resolver when we try to resolve targets
// for function pointers. At least we are only invoking the legacy resolver for a very small
// percentage of references now.
if (wouldResolveTo is FunctionDeclaration) {
// We need to invoke the legacy resolver, just to be sure
var legacy = scopeManager.resolveReference(current)

// This is just for us to catch these differences in symbol resolving in the future. The
// difference is pretty much only that the legacy system takes parameters of the
// function-pointer-type into account and the new system does not (yet), because it just
// takes the first match. This will need to be solved in the future.
if (legacy != wouldResolveTo) {
log.warn(
"The legacy symbol resolution and the new system produced different results here. This needs to be investigated in the future. For now, we take the legacy result."
)
wouldResolveTo = legacy
}
}

// Only consider resolving, if the language frontend did not specify a resolution. If we
// already have populated the wouldResolveTo variable, we can re-use this instead of
// resolving again
- var refersTo = current.refersTo ?: wouldResolveTo ?: scopeManager.resolveReference(current)
+ var refersTo = current.refersTo ?: wouldResolveTo

var recordDeclType: Type? = null
if (currentClass != null) {
@@ -218,9 +240,9 @@ open class SymbolResolver(ctx: TranslationContext) : ComponentPass(ctx) {
// only add new nodes for non-static unknown
if (
refersTo == null &&
- !current.isStaticAccess &&
- recordDeclType != null &&
- recordDeclType.recordDeclaration != null
+ !current.isStaticAccess &&
+ recordDeclType != null &&
+ recordDeclType.recordDeclaration != null
) {
// Maybe we are referring to a field instead of a local var
val field = resolveMember(recordDeclType, current)
1 change: 1 addition & 0 deletions docs/docs/CPG/impl/index.md
@@ -24,3 +24,4 @@ the graph. These two stages are strictly separated one from each other.
* [Languages and Language Frontends](./language)
* [Scopes](./scopes)
* [Passes](./passes)
* [Symbol Resolution](./symbol-resolver.md)
6 changes: 4 additions & 2 deletions docs/docs/CPG/impl/scopes.md
@@ -1,6 +1,6 @@
---
- title: "Implementation and Concepts - Scopes"
- linkTitle: "Implementation and Concepts - Scopes"
+ title: "Implementation and Concepts - Scopes and Symbols"
+ linkTitle: "Implementation and Concepts - Scopes and Symbols"
weight: 20
no_list: false
menu:
@@ -102,3 +102,5 @@ var name = parseName("std::string")
// This will return all the 'string' symbols within the 'std' name scope
var stringSymbols = scopeManager.findSymbols(name)
```

Developers should avoid symbol lookup during frontend parsing, since only a limited view of all symbols is usually available at that point. Instead, a dedicated pass that runs on the complete translation result is the preferred option. Apart from that, the main usage of this API is in the [SymbolResolver](symbol-resolver.md).
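
The following is a minimal, self-contained sketch of why a lookup during parsing can miss symbols that a later pass would find. It does not use the CPG API: `Stmt`, `Declare`, `Refer`, `resolveWhileParsing` and `resolveInPass` are hypothetical names made up for this illustration.

```kotlin
// Hypothetical toy "statements"; these are not CPG node types.
sealed interface Stmt

data class Declare(val name: String) : Stmt

data class Refer(val name: String) : Stmt

// Looks up each reference the moment it is "parsed". Only symbols declared
// earlier in the input are visible at that point.
fun resolveWhileParsing(stmts: List<Stmt>): Map<String, Boolean> {
    val seen = mutableSetOf<String>()
    val result = mutableMapOf<String, Boolean>()
    for (stmt in stmts) {
        when (stmt) {
            is Declare -> {
                seen.add(stmt.name)
            }
            is Refer -> {
                result[stmt.name] = stmt.name in seen
            }
        }
    }
    return result
}

// Runs after everything has been "parsed", so all declarations are visible.
fun resolveInPass(stmts: List<Stmt>): Map<String, Boolean> {
    val all = stmts.filterIsInstance<Declare>().map { it.name }.toSet()
    return stmts.filterIsInstance<Refer>().associate { it.name to (it.name in all) }
}

fun main() {
    // `helper` is referenced before it is declared.
    val stmts = listOf(Refer("helper"), Declare("helper"))

    println(resolveWhileParsing(stmts)) // {helper=false}
    println(resolveInPass(stmts)) // {helper=true}
}
```

The same reference is unresolved when looked up eagerly and resolved when looked up in a separate step over the complete input, which is the reason for preferring a dedicated pass.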
38 changes: 38 additions & 0 deletions docs/docs/CPG/impl/symbol-resolver.md
@@ -0,0 +1,38 @@
---
title: "Implementation and Concepts - Symbol Resolution"
linkTitle: "Implementation and Concepts - Symbol Resolution"
weight: 20
no_list: false
menu:
  main:
    weight: 20
description: >
The CPG library is a language-agnostic graph representation of source code.
---


# Symbol Resolution

This page describes the main functionality behind symbol resolution in the CPG library. This is mostly done by the `SymbolResolver` pass, in combination with the symbol lookup API (see [Scopes and Symbols](scopes.md#looking-up-symbols)). In addition to the *lookup* of a symbol, the *resolution* takes the result of the lookup as input and makes a "definite" decision as to which symbol is used. This mostly refers to symbols / names used in a `Reference` or a `CallExpression` (which also has a reference as its `CallExpression::callee`).

## The `SymbolResolver` Pass

The `SymbolResolver` pass takes care of the heavy lifting of symbol (or rather reference) resolving:

* It sets the `Reference::refersTo` property,
* and sets the `CallExpression::invokes` property,
* and finally takes care of operator overloading (if the language supports it).

In a way, it can be compared to a linker step in a compiler. The pass operates on a single `Component` and starts by identifying EOG starter nodes within the component. These nodes "start" an EOG sub-graph, i.e., they do not have any previous EOG edges. The symbol resolver uses the `ScopedWalker` with a special set-up that traverses the EOG starting with each EOG starter node until it reaches the end. This ensures that symbols are resolved in the correct order of "evaluation", e.g., that the base of a member expression is resolved before the expression itself. As a result, the necessary type information on the base is available when resolving the appropriate fields of the member expression.
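
The idea of EOG starter nodes and the resulting resolution order can be sketched in isolation. The snippet below is an illustration under simplifying assumptions, not the library's implementation: `ToyNode`, `connect`, `eogStarters` and `walk` are hypothetical names, and the real pass uses the `ScopedWalker` rather than this plain worklist.

```kotlin
// Hypothetical, simplified stand-in for a graph node; not the CPG Node class.
class ToyNode(val name: String) {
    val prevEOG = mutableListOf<ToyNode>()
    val nextEOG = mutableListOf<ToyNode>()
}

// Adds an EOG edge from one node to the node evaluated after it.
fun connect(from: ToyNode, to: ToyNode) {
    from.nextEOG.add(to)
    to.prevEOG.add(from)
}

// EOG starters are the nodes without any previous EOG edge.
fun eogStarters(nodes: List<ToyNode>): List<ToyNode> = nodes.filter { it.prevEOG.isEmpty() }

// Follows the EOG from a starter and returns the node names in visiting order.
fun walk(start: ToyNode): List<String> {
    val visited = LinkedHashSet<ToyNode>()
    val worklist = ArrayDeque(listOf(start))
    while (worklist.isNotEmpty()) {
        val node = worklist.removeFirst()
        // Only continue along the outgoing edges of nodes we have not seen yet.
        if (visited.add(node)) {
            worklist.addAll(node.nextEOG)
        }
    }
    return visited.map { it.name }
}

fun main() {
    // Toy EOG for the member expression `obj.field`: the base is evaluated first.
    val base = ToyNode("obj")
    val member = ToyNode("obj.field")
    connect(base, member)

    val starters = eogStarters(listOf(member, base))
    println(starters.map { it.name }) // [obj]
    println(walk(starters.single())) // [obj, obj.field]
}
```

Because the walk follows EOG edges, the base `obj` is always handled before `obj.field`, regardless of the order in which the nodes were collected.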

The symbol resolver itself has gone through many re-writes over the years and there is still some code left that we consider *legacy*. These functions are marked as such, and we aim to remove them slowly.

## Resolving References

The main functionality lies in `SymbolResolver::handleReference`. For all `Reference` nodes (that are not `MemberExpression` nodes) we use the symbol lookup API to find declaration candidates for the name the reference is referring to. This candidate list is then stored in `Reference::candidates`. If the reference is the `CallExpression::callee` property of a call, we abort here and jump to [Resolve Calls](#resolve-calls).

Otherwise, we currently take the first entry of the candidate list and set the `Reference::refersTo` property to it.
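
The two steps (collect candidates, then decide) can be sketched with toy types. Note that the `SymbolResolver` change in this commit uses `candidates.singleOrNull()`, which only yields a declaration if there is exactly one candidate, whereas `firstOrNull()` would yield the first of several. Everything below is a hypothetical illustration: `ToyDeclaration`, `ToyReference`, and the `findSymbols` and `resolve` functions shown here are made up for this sketch and are not the CPG API.

```kotlin
// Hypothetical stand-ins; the real declaration and reference types carry far more information.
data class ToyDeclaration(val name: String, val kind: String)

class ToyReference(val name: String) {
    var candidates: Set<ToyDeclaration> = emptySet()
    var refersTo: ToyDeclaration? = null
}

// Toy symbol table lookup: returns every declaration known under the given name.
fun findSymbols(symbols: Map<String, List<ToyDeclaration>>, name: String): List<ToyDeclaration> =
    symbols[name].orEmpty()

fun resolve(ref: ToyReference, symbols: Map<String, List<ToyDeclaration>>) {
    // Step 1: store all candidates found by the lookup.
    ref.candidates = findSymbols(symbols, ref.name).toSet()
    // Step 2: keep an already present result; otherwise only decide if unambiguous.
    ref.refersTo = ref.refersTo ?: ref.candidates.singleOrNull()
}

fun main() {
    val symbols =
        mapOf(
            "a" to listOf(ToyDeclaration("a", "variable")),
            "f" to listOf(ToyDeclaration("f", "function(int)"), ToyDeclaration("f", "function(float)")),
        )

    val a = ToyReference("a")
    resolve(a, symbols)
    println(a.refersTo) // ToyDeclaration(name=a, kind=variable)

    val f = ToyReference("f")
    resolve(f, symbols)
    println(f.refersTo) // null, because there are two candidates
    println(f.candidates.firstOrNull()?.kind) // function(int)
}
```

With `singleOrNull()`, an ambiguous name stays unresolved instead of silently picking one overload; with `firstOrNull()`, the result depends on the order of the candidate list.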

## Resolve Calls

Prerequisite: The `CallExpression::callee` reference must have been resolved (see [Resolving References](#resolving-references)).
3 changes: 2 additions & 1 deletion docs/mkdocs.yaml
@@ -167,8 +167,9 @@ nav:
- "Implementation":
- CPG/impl/index.md
- "Language Frontends": CPG/impl/language.md
- - "Scopes": CPG/impl/scopes.md
+ - "Scopes and Symbols": CPG/impl/scopes.md
- "Passes": CPG/impl/passes.md
- "Symbol Resolution": CPG/impl/symbol-resolver.md
- "Contributing":
- "Contributing to the CPG library": Contributing/index.md
# This assumes that the most recent dokka build was generated with the "main" tag!