-
Notifications
You must be signed in to change notification settings - Fork 6
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
1 parent
dc32e99
commit 853adfd
Showing
2 changed files
with
124 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,120 @@ | ||
# Concepts | ||
|
||
## Task | ||
|
||
At the heart of the Intelligence Layer is a `Task`. A Task is actually a pretty generic concept that just | ||
transforms an input-parameter to an output like a function in mathematics. | ||
|
||
``` | ||
Task: Input -> Output | ||
``` | ||
|
||
In Python this is expressed through an abstract class with type-parameters and the abstract method `do_run` | ||
where the actual transformation is implemented: | ||
|
||
```Python | ||
class Task(ABC, Generic[Input, Output]): | ||
|
||
@abstractmethod | ||
def do_run(self, input: Input, task_span: TaskSpan) -> Output: | ||
... | ||
``` | ||
|
||
`Input` and `Output` are normal Python datatypes that can be serialized from and to JSON. For this the Intelligence | ||
Layer relies on [Pydantic](https://docs.pydantic.dev/). The types that can actually be used are defined in form | ||
of the type-alias [`PydanticSerializable`](src/intelligence_layer/core/tracer.py#L44). | ||
|
||
The second parameter `task_span` is used for [tracing](#Trace) which is described below. | ||
|
||
`do_run` is the method that needs to be implemented for a concrete Task. The external interface of a | ||
Task is its `run` method: | ||
|
||
```Python | ||
class Task(ABC, Generic[Input, Output]): | ||
@final | ||
def run(self, input: Input, tracer: Tracer, trace_id: Optional[str] = None) -> Output: | ||
... | ||
``` | ||
|
||
Its signature differs only in the parameters regarding [tracing](#Trace). | ||
|
||
### Levels of abstraction | ||
|
||
Even though the concept is so generic the main purpose for a Task is of course to make use of an LLM for the | ||
transformation. Tasks are defined at different levels of abstraction. There are higher level Tasks (also called Use Cases) | ||
that reflect a typical user problem and there are lower level Tasks that are more about interfacing | ||
with an LLM on a very generic or even technical level. | ||
|
||
Examples for higher level tasks (Use Cases) are: | ||
|
||
- Answering a question based on a gievn document: `QA: (Document, Question) -> Answer` | ||
- Generate a summary of a given document: `Summary: Document -> Summary` | ||
|
||
Examples for lower level tasks are: | ||
|
||
- Let the model generate text based on an instruacton and some context: `Instruct: (Context, Instruction) -> Completion` | ||
- Chunk a text in smaller pieces at optimized boundaries (typically to make it fit into an LLM's context-size): `Chunk: Text -> [Chunk]` | ||
|
||
### Composability | ||
|
||
Tasks compose. Typically you would build higher level tasks from lower level tasks. Given a task you can draw a dependency graph | ||
that illustrates which sub-tasks it is using and in turn which sub-tasks they are using. This graph typically forms a hierarchy or | ||
more general a directed acyclic graph. The following drawing shows this graph for the Intelligence Layer's `RecursiveSummarize` | ||
Task: | ||
|
||
<img src="./assets/RecursiveSummary.drawio.svg"> | ||
|
||
|
||
### Trace | ||
|
||
A Task implements a workflow. It processes its input, passes it on to sub-tasks, processes the outputs of sub-tasks | ||
to build its own output. This workflow can be represented in a trace. For this a Task's `run` method takes a `Tracer` | ||
that takes care of storing details on the steps of this workflow like the tasks that have been invoked along with their | ||
input and output and timing information. For this the tracing defines the following concepts: | ||
|
||
- A `Tracer` is passed to a Task's `run` method and provides methods for opening `Span`s or `TaskSpan`s. | ||
- A `Span` allows for grouping multiple logs and duration together as a single, logical step in the | ||
workflow. | ||
- A `TaskSpan` allows for grouping multiple logs together, as well as the task's specific input, output, | ||
and duration. | ||
|
||
Each of these concepts is implemented in form of an abstract base class and the Intelligence Layer provides | ||
several implementations: | ||
|
||
- The `NoOpTracer` can be used when tracing information shall not be stored at all. | ||
|
||
## Evaluation | ||
|
||
### Dataset | ||
|
||
- List of examples (`Input`) | ||
|
||
### Run | ||
|
||
- Compute `Output`s for Dataset | ||
|
||
### Evaluate | ||
|
||
- Evaluate a single run to create an results that can be compared | ||
- Compare multiple runs with a single evaluation (e.g. ELO) | ||
|
||
### Aggregate | ||
|
||
- Aggregate results from a single evaluation | ||
- Aggregate results from multiple compare-evaluations to complete comparison | ||
|
||
### Data Storage | ||
|
||
- DatasetRepository | ||
- RunRepository | ||
- EvaluationRepository | ||
- AggregationRepository | ||
|
||
|
||
explainability: | ||
- debug loglevel explain (full prompt vs focus (RAG)) (prompt whisper) | ||
- eval: unexpected result: explain for input (aggregate) | ||
- run explain only on "failed" | ||
|
||
Run: | ||
- scheduled |
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.