Towards better inference: bits → nibbles #3808
Draft
originalsouth wants to merge 81 commits into main from feature/nibbles

Commits (81)
ae00a8e Introducing nibbles (originalsouth)
c90fcb0 Prototyping (originalsouth)
d57cf19 Merge remote-tracking branch 'origin/main' into feature/nibbles (originalsouth)
64ece62 Merge remote-tracking branch 'origin/main' into feature/nibbles (originalsouth)
bba22a3 Merge remote-tracking branch 'origin/main' into feature/nibbles (originalsouth)
0896eba set default in model (noamblitz)
964b89b remove default bit (noamblitz)
5915f03 fix test (noamblitz)
ed7be58 Fix Octopoes tests for patch related changes (originalsouth)
efa3c97 Merge branch 'set-default-risk-in-model' of github.com:minvws/nl-kat-… (originalsouth)
663a9bb Fix Octopoes tests for patch related changes II (originalsouth)
bd78ed9 Merge branch 'main' into set-default-risk-in-model (originalsouth)
b5ba90a Fix Octopoes tests for patch related changes III (originalsouth)
f885652 Merge branch 'set-default-risk-in-model' of github.com:minvws/nl-kat-… (originalsouth)
b05283e Prevent race conditions between Octopoes' event manager and the sched… (originalsouth)
06d1080 Merge branch 'main' into set-default-risk-in-model (underdarknl)
5bf8b35 Merge branch 'main' into set-default-risk-in-model (originalsouth)
967d41b Merge branch 'main' into set-default-risk-in-model (underdarknl)
d30b33f Merge remote-tracking branch 'origin/main' into feature/nibbles (originalsouth)
86fe7d5 Merge branch 'fix/prevent_race_conditions_between_event_manager_and_s… (originalsouth)
dca2b20 Merge branch 'set-default-risk-in-model' into feature/nibbles (originalsouth)
7699d93 Fixes for idle run (originalsouth)
0eb106f Merge branch 'main' into feature/nibbles (originalsouth)
2ed89fb Manual merge (originalsouth)
d9c9fa2 Revert "Set default findingtype risk in model instead of in bit (#3562)" (originalsouth)
20c5abf Pre-commit after revert (originalsouth)
2d09141 Remove bogus rlu_cache (originalsouth)
6adeffe Merge remote-tracking branch 'origin/main' into feature/nibbles (originalsouth)
f3f4277 Register origins and add parameters begins (originalsouth)
ef9ad80 Merge remote-tracking branch 'origin/main' into feature/nibbles (originalsouth)
5546cd8 Merge remote-tracking branch 'origin/main' into feature/nibbles (originalsouth)
cf2f04c Merge remote-tracking branch 'origin/main' into feature/nibbles (originalsouth)
6fd5f74 Merge remote-tracking branch 'origin/main' into feature/nibbles (originalsouth)
1b49c3b Merge remote-tracking branch 'origin/main' into feature/nibbles (originalsouth)
b28ae84 Merge remote-tracking branch 'origin/main' into feature/nibbles (originalsouth)
8b0f50d Merge remote-tracking branch 'origin/main' into feature/nibbles (originalsouth)
f140e87 Merge remote-tracking branch 'origin/main' into feature/nibbles (originalsouth)
be03bf8 Add blocklist and ooi reuse to inference (originalsouth)
852ec3e Fix runner (originalsouth)
ed4c40a Merge remote-tracking branch 'origin/main' into feature/nibbles (originalsouth)
df9a329 Basic nibbler (originalsouth)
5908b42 Merge remote-tracking branch 'origin/main' into feature/nibbles (originalsouth)
2de975d Merge remote-tracking branch 'origin/main' into feature/nibbles (originalsouth)
a67b297 Add more boilerplating (originalsouth)
f20cb4b Check clearance for seed OOI in nibbles (originalsouth)
d706b35 Merge remote-tracking branch 'origin/main' into feature/nibbles (originalsouth)
49e1116 Add unit test (originalsouth)
8ff6fac Add unit test (originalsouth)
a9da549 Merge remote-tracking branch 'origin/main' into feature/nibbles (originalsouth)
6fbcf12 Make SonarClaus Happier (originalsouth)
bd7b82d Merge remote-tracking branch 'origin/main' into feature/nibbles (originalsouth)
13400b3 More testing and fixing (originalsouth)
137b687 Merge remote-tracking branch 'origin/main' into feature/nibbles (originalsouth)
aa66104 Moves towards a new niddles (originalsouth)
4d9baa2 Purge NMAX (originalsouth)
63cdaec Another day another design (originalsouth)
f337ee3 Merge remote-tracking branch 'origin/main' into feature/nibbles (originalsouth)
a18929b Merge remote-tracking branch 'origin/main' into feature/nibbles (originalsouth)
e35b101 Add multivariable support (originalsouth)
0c8a6bb Merge remote-tracking branch 'origin/main' into feature/nibbles (originalsouth)
d320be2 Refactor (originalsouth)
87909ae Fix typing (originalsouth)
4b853d9 Refactor (originalsouth)
d084a38 Merge remote-tracking branch 'origin/main' into feature/nibbles (originalsouth)
5266ccd Merge remote-tracking branch 'origin/main' into feature/nibbles (originalsouth)
e7b3a5a Merge remote-tracking branch 'origin/main' into feature/nibbles (originalsouth)
bd59705 Mostly fix nibble-origins -> nibblettes (originalsouth)
e9a4576 Merge remote-tracking branch 'origin/main' into feature/nibbles (originalsouth)
8c6d6e5 Add comment (originalsouth)
9890402 Merge remote-tracking branch 'origin/main' into feature/nibbles (originalsouth)
ac80ae0 Give me the $$$ AWK input (originalsouth)
d978519 Merge remote-tracking branch 'origin/main' into feature/nibbles (originalsouth)
6272afc Faster serialization (originalsouth)
82b6ad4 Skip encoding (originalsouth)
dee4a4a Revert "Faster serialization" (originalsouth)
c40537d nibblette -> nibblet (originalsouth)
9e0a0ca Merge remote-tracking branch 'origin/main' into feature/nibbles (originalsouth)
d19812d Test re-evaluation (originalsouth)
5e5ff0b Merge remote-tracking branch 'origin/main' into feature/nibbles (originalsouth)
cc73cf0 Fix double dict entry "bug" (originalsouth)
986b32d Run all nibbles not touched by nibblets (originalsouth)
Empty file.
@@ -0,0 +1,96 @@
import importlib
import pkgutil
from collections.abc import Iterable
from pathlib import Path
from types import MethodType, ModuleType

import structlog
from pydantic import BaseModel

from octopoes.models import OOI

NIBBLES_DIR = Path(__file__).parent
NIBBLE_ATTR_NAME = "NIBBLE"
NIBBLE_FUNC_NAME = "nibble"
logger = structlog.get_logger(__name__)


class NibbleParameter(BaseModel):
    object_type: type
    parser: str = "[]"

    def __eq__(self, other):
        if isinstance(other, NibbleParameter):
            return vars(self) == vars(other)
        elif isinstance(other, type):
            return self.object_type == other
        else:
            return False


class NibbleDefinition:
    id: str
    signature: list[NibbleParameter]
    query: str | None = None
    min_scan_level: int = 1
    default_enabled: bool = True
    config_ooi_relation_path: str | None = None
    payload: MethodType | None = None

    def __init__(
        self,
        name: str,
        signature: list[NibbleParameter],
        query: str | None = None,
        min_scan_level: int = 1,
        default_enabled: bool = True,
        config_ooi_relation_path: str | None = None,
    ):
        self.id = name
        self.signature = signature
        self.query = query
        self.min_scan_level = min_scan_level
        self.default_enabled = default_enabled
        self.config_ooi_relation_path = config_ooi_relation_path

    def __call__(self, args: Iterable[OOI]) -> OOI | Iterable[OOI | None] | None:
        if self.payload is None:
            raise NotImplementedError
        else:
            return self.payload(*args)


def get_nibble_definitions() -> dict[str, NibbleDefinition]:
    nibble_definitions = {}

    for package in pkgutil.walk_packages([str(NIBBLES_DIR)]):
        if package.name in ["definitions", "runner"]:
            continue

        try:
            module: ModuleType = importlib.import_module(".nibble", f"{NIBBLES_DIR.name}.{package.name}")

            if hasattr(module, NIBBLE_ATTR_NAME):
                nibble_definition: NibbleDefinition = getattr(module, NIBBLE_ATTR_NAME)

                try:
                    payload: ModuleType = importlib.import_module(
                        f".{package.name}", f"{NIBBLES_DIR.name}.{package.name}"
                    )
                    if hasattr(payload, NIBBLE_FUNC_NAME):
                        nibble_definition.payload = getattr(payload, NIBBLE_FUNC_NAME)
                    else:
                        logger.warning('module "%s" has no function %s', package.name, NIBBLE_FUNC_NAME)

                except ModuleNotFoundError:
                    logger.warning('package "%s" has no function nibble', package.name)

                nibble_definitions[nibble_definition.id] = nibble_definition

            else:
                logger.warning('module "%s" has no attribute %s', package.name, NIBBLE_ATTR_NAME)

        except ModuleNotFoundError:
            logger.warning('package "%s" has no module nibble', package.name)

    return nibble_definitions
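
The custom `__eq__` on `NibbleParameter` appears to be what allows a bare type to be compared against a signature entry, so that a membership test such as `type(ooi) in x.signature` can succeed. The following is a minimal sketch of that behaviour, not the PR's code: it uses a stdlib dataclass as a stand-in for the Pydantic model to stay dependency-free, and `Hostname` and `IPAddress` are hypothetical placeholder classes rather than real OOI types.

```python
from dataclasses import dataclass


# Stand-in for the Pydantic-based NibbleParameter in the diff (assumption:
# a plain dataclass is used only to keep this sketch dependency-free).
@dataclass(eq=False)
class NibbleParameter:
    object_type: type
    parser: str = "[]"

    def __eq__(self, other):
        # Two parameters are equal when all of their fields match.
        if isinstance(other, NibbleParameter):
            return vars(self) == vars(other)
        # A bare class is "equal" to a parameter that wraps that class.
        if isinstance(other, type):
            return self.object_type == other
        return False


# Hypothetical object types, standing in for OOI subclasses.
class Hostname:
    pass


class IPAddress:
    pass


signature = [NibbleParameter(object_type=Hostname)]

# `in` falls back to NibbleParameter.__eq__, so a class can be looked up
# directly in a list of parameters.
hostname_matches = Hostname in signature
ipaddress_matches = IPAddress in signature
print(hostname_matches, ipaddress_matches)
```

One consequence of this design is that the `parser` field is ignored when comparing against a bare type, so the membership test only answers "does this nibble accept this object type at all".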
@@ -0,0 +1,138 @@
import json
from collections.abc import Callable, Iterable
from datetime import datetime
from typing import TypeVar

from xxhash import xxh3_128_hexdigest as xxh3  # INFO: xxh3_64_hexdigest is faster but hash more collision probabilities

from nibbles.definitions import NibbleDefinition, get_nibble_definitions
from octopoes.models import OOI
from octopoes.models.origin import Origin, OriginType
from octopoes.models.types import type_by_name
from octopoes.repositories.ooi_repository import OOIRepository
from octopoes.repositories.origin_repository import OriginRepository
from octopoes.repositories.scan_profile_repository import ScanProfileRepository

T = TypeVar("T")
U = TypeVar("U")


def ooi_type(ooi: OOI) -> type[OOI]:
    return type_by_name(ooi.get_ooi_type())


def merge_with(func: Callable[[set[T], set[T]], set[T]], d1: dict[U, set[T]], d2: dict[U, set[T]]) -> dict[U, set[T]]:
    return {k: func(d1.get(k, set()), d2.get(k, set())) for k in set(d1) | set(d2)}


def flatten(items: Iterable[OOI | Iterable[OOI | None] | None]) -> Iterable[OOI]:
    for item in items:
        if isinstance(item, OOI):
            yield item
        elif item is None:
            continue
        else:
            yield from flatten(item)


def nibble_hasher(data: Iterable) -> str:
    return xxh3(
        "".join(
            [
                json.dumps(json.loads(ooi.model_dump_json()), sort_keys=True)
                if isinstance(ooi, OOI)
                else json.dumps(ooi, sort_keys=True)
                for ooi in data
            ]
        )
    )


class NibblesRunner:
    def __init__(
        self,
        ooi_repository: OOIRepository,
        origin_repository: OriginRepository,
        scan_profile_repository: ScanProfileRepository,
        perform_writes: bool = True,
    ):
        self.ooi_repository = ooi_repository
        self.origin_repository = origin_repository
        self.scan_profile_repository = scan_profile_repository
        self.perform_writes = perform_writes
        self.update_nibbles()

    def update_nibbles(self):
        self.nibbles: dict[str, NibbleDefinition] = get_nibble_definitions()

    def _run(self, ooi: OOI, valid_time: datetime) -> dict[str, dict[tuple, set[OOI]]]:
        return_value: dict[str, dict[tuple, set[OOI]]] = {}
        nibblets = self.origin_repository.list_origins(
            valid_time, origin_type=OriginType.NIBBLET, parameters_references=[ooi.reference]
        )
        if nibblets:
            for nibblet in nibblets:
                # INFO: we do not strictly need this if statement because OriginType.NIBBLETS \
                # always have parameters_references but it makes the linters super happy
                if nibblet.parameters_references:
                    nibble = self.nibbles[nibblet.method]
                    args = self.ooi_repository.nibble_query(
                        ooi,
                        nibble,
                        valid_time,
                        nibblet.parameters_references
                        if nibble.query is not None and nibble.query.count("$") > 0
                        else [],
                    )
                    results = {
                        tuple(arg): set(flatten([nibble(arg)]))
                        for arg in args
                        if nibblet.parameters_hash != nibble_hasher(arg)
                    }
                    return_value |= {nibble.id: results}
        nibblet_nibbles = {self.nibbles[nibblet.method] for nibblet in nibblets}
        for nibble in filter(lambda x: type(ooi) in x.signature and x not in nibblet_nibbles, self.nibbles.values()):
            args = self.ooi_repository.nibble_query(ooi, nibble, valid_time)
            results = {tuple(arg): set(flatten([nibble(arg)])) for arg in args}
            return_value |= {nibble.id: results}
        # TODO: we could cache the writes for single OOI nibbles
        self._write({ooi: return_value}, valid_time)
        return return_value

    def _cleared(self, ooi: OOI, valid_time: datetime) -> bool:
        ooi_level = self.scan_profile_repository.get(ooi.reference, valid_time).level.value
        target_nibbles = filter(lambda x: type(ooi) in x.signature, self.nibbles.values())
        return any(nibble.min_scan_level < ooi_level for nibble in target_nibbles)

    def _write(self, inferences: dict[OOI, dict[str, dict[tuple, set[OOI]]]], valid_time: datetime):
        if self.perform_writes:
            for source_ooi, results in inferences.items():
                self.ooi_repository.save(source_ooi, valid_time)
                for nibble_id, run_result in results.items():
                    for arg, result in run_result.items():
                        nibble_origin = Origin(
                            method=nibble_id,
                            origin_type=OriginType.NIBBLET,
                            source=source_ooi.reference,
                            result=[ooi.reference for ooi in result],
                            parameters_hash=nibble_hasher(arg),
                            # TODO: What to do if a is not an OOI?
                            parameters_references=[a.reference for a in arg if isinstance(a, OOI)],
                        )
                        for ooi in result:
                            self.ooi_repository.save(ooi, valid_time=valid_time)
                        self.origin_repository.save(nibble_origin, valid_time=valid_time)

    def infer(self, stack: list[OOI], valid_time: datetime) -> dict[OOI, dict[str, dict[tuple, set[OOI]]]]:
        inferences: dict[OOI, dict[str, dict[tuple, set[OOI]]]] = {}
        blockset = set(stack)
        if stack and self._cleared(stack[-1], valid_time):
            while stack:
                ooi = stack.pop()
                results = self._run(ooi, valid_time)
                if results:
                    blocks = set.union(set(), *[ooiset for result in results.values() for _, ooiset in result.items()])
                    stack += [o for o in blocks if o not in blockset]
                    blockset |= blocks
                    inferences |= {ooi: results}
        return inferences
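
The `infer` method reads as a worklist traversal: objects are popped from `stack`, run through the nibbles, and any newly produced objects are pushed back unless `blockset` has already seen them. The sketch below illustrates only that control flow under simplifying assumptions. Strings stand in for OOIs, the hypothetical `RULES` table stands in for `NibblesRunner._run`, and the scan-level check done by `_cleared` is left out.

```python
# Hypothetical inference rules: each item yields zero or more derived items.
# In the diff this role is played by NibblesRunner._run; the names here are
# illustrative only.
RULES: dict[str, set[str]] = {
    "a": {"b", "c"},
    "b": {"c", "a"},  # "a" has already been queued, so it is not revisited
    "c": set(),
}


def infer(stack: list[str]) -> dict[str, set[str]]:
    inferences: dict[str, set[str]] = {}
    seen = set(stack)  # plays the role of `blockset` in the diff
    while stack:
        item = stack.pop()
        derived = RULES.get(item, set())
        if derived:
            # Only schedule items that were never queued before, so cycles
            # such as a -> b -> a terminate.
            stack += sorted(d for d in derived if d not in seen)
            seen |= derived
            inferences[item] = derived
    return inferences


result = infer(["a"])
print(result)
```

Tracking everything that was ever queued, rather than only what has been processed, is what keeps the loop finite when two rules produce each other's inputs.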
I was wondering: why is this implemented as a POJO-like class rather than as, e.g., a Pydantic class?
Somehow the Pydantic class does not work well with the importlib yielding the payload... not sure why, but switching to a plain class fixed the issues, so I moved on, perhaps hoping one day you would fix it ;)