feat: workflows refactoring #2777

Closed
wants to merge 77 commits into from
3eab05a
fix: ability to raise alerts from workflows
Matvey-Kuk Dec 9, 2024
4d66140
fix: opsgenie notify to return response
talboren Dec 8, 2024
9130fe2
fix: fix
talboren Dec 8, 2024
8ff57ed
feat: wip
talboren Dec 8, 2024
22aadc8
feat: wip
talboren Dec 8, 2024
bff0e50
fix: alert hash
talboren Dec 8, 2024
615323e
fix: fix
talboren Dec 8, 2024
11faae0
fix: KEEP_DEBUG_TASKS
talboren Dec 8, 2024
187228b
fix: tests
talboren Dec 8, 2024
c5a3f60
fix: tests
talboren Dec 8, 2024
6aee01a
fix: tests
talboren Dec 8, 2024
bb3970f
fix: fix
talboren Dec 8, 2024
edfe0f8
fix: tests
talboren Dec 9, 2024
90ed82a
fix: tests
talboren Dec 9, 2024
f826891
fix: fix
talboren Dec 9, 2024
00d9607
fix: fix
talboren Dec 9, 2024
615f4fd
fix: some more
talboren Dec 10, 2024
c4f7b87
fix: fix
talboren Dec 10, 2024
b470255
fix: psycopg
talboren Dec 10, 2024
6f8ed56
Merge branch 'main' into feat/observability
talboren Dec 10, 2024
adbf86e
fix: update sqlalchemy
talboren Dec 10, 2024
983e5fb
fix: sqlalchemy again
talboren Dec 10, 2024
a4d4586
fix: psycopg
talboren Dec 10, 2024
dd9dac1
fix: postgres
talboren Dec 10, 2024
478b959
Merge branch 'main' into feat/observability
talboren Dec 10, 2024
e171100
fix: fix
talboren Dec 10, 2024
263812e
fix: fix
talboren Dec 10, 2024
bc6329a
fix: fix
talboren Dec 10, 2024
36131e3
fix: prom
talboren Dec 11, 2024
3fcde8b
fix: fix
talboren Dec 11, 2024
f988949
Merge branch 'main' into feat/observability
talboren Dec 11, 2024
8f42b69
fix: fix
talboren Dec 11, 2024
cf924c3
fix: fix
talboren Dec 11, 2024
e7c878a
fix: config
talboren Dec 11, 2024
5165421
fix: fix
talboren Dec 11, 2024
d9e42ac
fix: tests
talboren Dec 11, 2024
30b86b4
Merge branch 'main' into feat/observability
talboren Dec 11, 2024
c5bf342
fix: fix
talboren Dec 11, 2024
36a9e93
fix: fix
talboren Dec 11, 2024
c8b8836
fix: fix
talboren Dec 11, 2024
fca8531
async workflows
Matvey-Kuk Dec 11, 2024
bf98067
Merge branch 'main' into Matvey-Kuk/workflows-fix
Matvey-Kuk Dec 11, 2024
bc33a27
Merge branch 'main' into Matvey-Kuk/workflows-fix
Matvey-Kuk Dec 11, 2024
e1d4c5e
Merge remote-tracking branch 'refs/remotes/origin/Matvey-Kuk/workflow…
Matvey-Kuk Dec 11, 2024
23ac3e5
Merge branch 'feat/observability' into Matvey-Kuk/workflows-fix
Matvey-Kuk Dec 11, 2024
f75e93f
fix: fix
talboren Dec 11, 2024
e9441e9
fix: fix
talboren Dec 11, 2024
43af876
fix: fix
talboren Dec 11, 2024
1fa977c
fix: fix
talboren Dec 11, 2024
c329732
fix: try
talboren Dec 11, 2024
05b790f
fix: fix
talboren Dec 11, 2024
a74ac28
fix: fix
talboren Dec 11, 2024
78814e2
fix: fix
talboren Dec 11, 2024
4854948
fix: fix
talboren Dec 11, 2024
dfdd99c
fix: fix
talboren Dec 11, 2024
9867970
Merge branch 'main' into feat/observability
talboren Dec 11, 2024
a8bec64
fix: fix
talboren Dec 11, 2024
8273d16
fix: fix
talboren Dec 12, 2024
efadcb1
fix: more metrics
talboren Dec 12, 2024
492b55e
fix: fix
talboren Dec 12, 2024
58421da
fix: fix
talboren Dec 12, 2024
52e1e85
fix: tests
talboren Dec 12, 2024
6a7cf83
fix: fix
talboren Dec 12, 2024
29a0d54
fix: fix
talboren Dec 12, 2024
5fc9403
fix: dedup
talboren Dec 12, 2024
bd4948c
Merge branch 'main' into feat/observability
talboren Dec 12, 2024
99fce73
fix: leftover
talboren Dec 12, 2024
142a705
fix: remove nested
talboren Dec 12, 2024
31ee8cd
fix: fix
talboren Dec 12, 2024
f54152b
fix: fix
talboren Dec 12, 2024
05863da
WIP still
Matvey-Kuk Dec 12, 2024
278c245
Merge remote-tracking branch 'origin/feat/observability' into Matvey-…
Matvey-Kuk Dec 14, 2024
d5ec2f6
Next batch of async wf stuff
Matvey-Kuk Dec 16, 2024
2b918f6
more..
Matvey-Kuk Dec 16, 2024
8280fb5
More async
Matvey-Kuk Dec 16, 2024
c3c77ed
Reduce YAML parsing
Matvey-Kuk Dec 16, 2024
35a0218
Duplicate clickhouse provider
Matvey-Kuk Dec 22, 2024
12 changes: 6 additions & 6 deletions .github/workflows/test-pr-e2e.yml
@@ -4,9 +4,9 @@ on:
workflow_dispatch:
pull_request:
paths:
- 'keep/**'
- 'keep-ui/**'
- 'tests/**'
- "keep/**"
- "keep-ui/**"
- "tests/**"

concurrency:
group: ${{ github.workflow }}-${{ github.head_ref }}
@@ -123,7 +123,7 @@ jobs:

# create the state directory
# mkdir -p ./state && chown -R root:root ./state && chmod -R 777 ./state

- name: Run e2e tests and report coverage
run: |
poetry run coverage run --branch -m pytest -s tests/e2e_tests/
@@ -147,9 +147,9 @@ jobs:

- name: Upload test artifacts on failure
if: always()
uses: actions/upload-artifact@v3
uses: actions/upload-artifact@v4.4.3
with:
name: test-artifacts
name: test-artifacts-my-artifacts-${{ matrix.db_type }}
path: |
playwright_dump_*.html
playwright_dump_*.png
6 changes: 4 additions & 2 deletions .github/workflows/test-pr.yml
@@ -91,9 +91,11 @@ jobs:
run: poetry install --no-interaction --no-root --with dev

- name: Run unit tests and report coverage
env:
LOG_LEVEL: DEBUG
SQLALCHEMY_WARN_20: 1
run: |
poetry run coverage run --branch -m pytest -n auto --non-integration --ignore=tests/e2e_tests/

poetry run coverage run --branch -m pytest --timeout 10 -n auto --non-integration --ignore=tests/e2e_tests/

- name: Run integration tests and report coverage
run: |
2 changes: 1 addition & 1 deletion docker/Dockerfile.api
@@ -42,4 +42,4 @@ USER keep

ENTRYPOINT ["/venv/lib/python3.11/site-packages/keep/entrypoint.sh"]

CMD ["gunicorn", "keep.api.api:get_app", "--bind" , "0.0.0.0:8080" , "--workers", "4" , "-k" , "uvicorn.workers.UvicornWorker", "-c", "/venv/lib/python3.11/site-packages/keep/api/config.py", "--preload"]
CMD ["gunicorn", "keep.api.api:get_app", "--bind" , "0.0.0.0:8080" , "--workers", "4" , "-k" , "keep.api.custom_worker.CustomUvicornWorker", "-c", "/venv/lib/python3.11/site-packages/keep/api/config.py"]
111 changes: 69 additions & 42 deletions keep/api/alert_deduplicator/alert_deduplicator.py
@@ -14,7 +14,7 @@
get_alerts_fields,
get_all_deduplication_rules,
get_all_deduplication_stats,
get_custom_deduplication_rules,
get_custom_deduplication_rule,
get_last_alert_hash_by_fingerprint,
update_deduplication_rule,
)
@@ -31,12 +31,16 @@

class AlertDeduplicator:

DEDUPLICATION_DISTRIBUTION_ENABLED = config(
"KEEP_DEDUPLICATION_DISTRIBUTION_ENABLED", cast=bool, default=True
)
CUSTOM_DEDUPLICATION_DISTRIBUTION_ENABLED = config(
"KEEP_CUSTOM_DEDUPLICATION_ENABLED", cast=bool, default=True
)

def __init__(self, tenant_id):
self.logger = logging.getLogger(__name__)
self.tenant_id = tenant_id
self.provider_distribution_enabled = config(
"PROVIDER_DISTRIBUTION_ENABLED", cast=bool, default=True
)
self.search_engine = SearchEngine(self.tenant_id)

def _apply_deduplication_rule(
@@ -91,13 +95,23 @@ def _apply_deduplication_rule(
},
)
alert.isPartialDuplicate = True
else:
self.logger.info(
"Alert is not deduplicated",
extra={
"alert_id": alert.id,
"fingerprint": alert.fingerprint,
"tenant_id": self.tenant_id,
"last_alert_hash_by_fingerprint": last_alert_hash_by_fingerprint,
},
)

return alert

def apply_deduplication(self, alert: AlertDto) -> bool:
# IMPORTANT NOTE TO SOMEONE WORKING ON THIS CODE:
# apply_deduplication runs AFTER _format_alert, so you can assume that alert fields are in the expected format.
# you can also safe to assume that alert.fingerprint is set by the provider itself
# you are also safe to assume that alert.fingerprint is set by the provider itself

# get only relevant rules
rules = self.get_deduplication_rules(
@@ -122,26 +136,30 @@ def apply_deduplication(self, alert: AlertDto) -> bool:
"is_partial_duplicate": alert.isPartialDuplicate,
},
)
if alert.isFullDuplicate or alert.isPartialDuplicate:
# create deduplication event
create_deduplication_event(
tenant_id=self.tenant_id,
deduplication_rule_id=rule.id,
deduplication_type="full" if alert.isFullDuplicate else "partial",
provider_id=alert.providerId,
provider_type=alert.providerType,
)
# we don't need to check the other rules
break
else:
# create none deduplication event, for statistics
create_deduplication_event(
tenant_id=self.tenant_id,
deduplication_rule_id=rule.id,
deduplication_type="none",
provider_id=alert.providerId,
provider_type=alert.providerType,
)

if AlertDeduplicator.DEDUPLICATION_DISTRIBUTION_ENABLED:
if alert.isFullDuplicate or alert.isPartialDuplicate:
# create deduplication event
create_deduplication_event(
tenant_id=self.tenant_id,
deduplication_rule_id=rule.id,
deduplication_type=(
"full" if alert.isFullDuplicate else "partial"
),
provider_id=alert.providerId,
provider_type=alert.providerType,
)
# we don't need to check the other rules
break
else:
# create none deduplication event, for statistics
create_deduplication_event(
tenant_id=self.tenant_id,
deduplication_rule_id=rule.id,
deduplication_type="none",
provider_id=alert.providerId,
provider_type=alert.providerType,
)

return alert

@@ -166,11 +184,15 @@ def get_deduplication_rules(
self, tenant_id, provider_id, provider_type
) -> DeduplicationRuleDto:
# try to get the rule from the database
rules = get_custom_deduplication_rules(tenant_id, provider_id, provider_type)
rule = (
get_custom_deduplication_rule(tenant_id, provider_id, provider_type)
if AlertDeduplicator.CUSTOM_DEDUPLICATION_DISTRIBUTION_ENABLED
else None
)

if not rules:
if not rule:
self.logger.debug(
"No custom deduplication rules found, using deafult full deduplication rule",
"No custom deduplication rule found, using deafult full deduplication rule",
extra={
"provider_id": provider_id,
"provider_type": provider_type,
@@ -189,12 +211,10 @@ def get_deduplication_rules(
"tenant_id": tenant_id,
},
)
#
# check that at least one of them is full deduplication rule
full_deduplication_rules = [rule for rule in rules if rule.full_deduplication]

# if full deduplication rule found, return the rules
if full_deduplication_rules:
return rules
if rule.full_deduplication:
return [rule]

# if not, assign them the default full deduplication rule ignore fields
self.logger.info(
@@ -203,13 +223,8 @@ def get_deduplication_rules(
default_full_dedup_rule = self._get_default_full_deduplication_rule(
provider_id=provider_id, provider_type=provider_type
)
for rule in rules:
if not rule.full_deduplication:
self.logger.debug(
"Assigning default full deduplication rule ignore fields",
)
rule.ignore_fields = default_full_dedup_rule.ignore_fields
return rules
rule.ignore_fields = default_full_dedup_rule.ignore_fields
return [rule]

def _generate_uuid(self, provider_id, provider_type):
# this is a way to generate a unique uuid for the default deduplication rule per (provider_id, provider_type)
@@ -269,7 +284,11 @@ def get_deduplications(self) -> list[DeduplicationRuleDto]:
provider_id, provider_type = dd.provider_id, dd.provider_type
dd.id = self._generate_uuid(provider_id, provider_type)
# get custom deduplication rules
custom_deduplications = get_all_deduplication_rules(self.tenant_id)
custom_deduplications = (
get_all_deduplication_rules(self.tenant_id)
if AlertDeduplicator.CUSTOM_DEDUPLICATION_DISTRIBUTION_ENABLED
else []
)
# cast to dto
custom_deduplications_dto = [
DeduplicationRuleDto(
@@ -347,6 +366,14 @@ def get_deduplications(self) -> list[DeduplicationRuleDto]:

result = []
for dedup in final_deduplications:
self.logger.debug(
"Calculating deduplication stats",
extra={
"deduplication_rule_id": dedup.id,
"tenant_id": self.tenant_id,
"deduplication_stats": deduplication_stats,
},
)
key = dedup.id
full_dedup = deduplication_stats.get(key, {"full_dedup_count": 0}).get(
"full_dedup_count", 0
@@ -377,7 +404,7 @@ def get_deduplications(self) -> list[DeduplicationRuleDto]:
)
result.append(dedup)

if self.provider_distribution_enabled:
if AlertDeduplicator.DEDUPLICATION_DISTRIBUTION_ENABLED:
for dedup in result:
for pd, stats in deduplication_stats.items():
if pd == f"{dedup.provider_id}_{dedup.provider_type}":
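
The rule selection in `get_deduplication_rules` above can be read as a small decision table: use the default rule when custom rules are disabled or absent, return a custom full-deduplication rule as-is, and otherwise copy the default rule's `ignore_fields` onto the custom rule. The following is only a minimal sketch of that flow, not the code from this PR. `Rule`, `env_flag`, and `select_rules` are hypothetical names, the environment parser stands in for the `config(...)` helper used in the diff, and the "no custom rule" branch is assumed to return the default rule in a list because those lines are collapsed in the diff.

```python
import os
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Rule:
    """Hypothetical stand-in for DeduplicationRuleDto."""
    name: str
    full_deduplication: bool
    ignore_fields: list = field(default_factory=list)


def env_flag(name: str, default: bool = True) -> bool:
    """Illustrative boolean env parser (assumption, not the project's config())."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def select_rules(custom_rule: Optional[Rule], default_rule: Rule,
                 custom_enabled: bool) -> list:
    """Return the rule list to apply for one (provider_id, provider_type)."""
    # Flag off or no custom rule: fall back to the default full rule
    # (assumed behaviour; this branch is collapsed in the diff).
    if not custom_enabled or custom_rule is None:
        return [default_rule]
    # A custom full-deduplication rule is used unchanged.
    if custom_rule.full_deduplication:
        return [custom_rule]
    # A partial custom rule inherits the default rule's ignore fields.
    custom_rule.ignore_fields = list(default_rule.ignore_fields)
    return [custom_rule]
```

For example, `select_rules(None, default_rule, env_flag("KEEP_CUSTOM_DEDUPLICATION_ENABLED"))` would return a single-element list holding the default rule. Because the diff reads both flags as class attributes, they would be evaluated once at import time, so changing the environment afterwards would not affect a running process.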