
Updating Monitoring + CW Constructs #18

Merged (1 commit) on Jul 26, 2024

Conversation

rpmcginty (Collaborator) commented Jul 25, 2024

What's in this Change?

  • cleaning up monitoring constructs

Testing

  • deploying changes to merscope pipeline

@rpmcginty rpmcginty force-pushed the feature/update-monitoring-constructs branch from 8b97aaa to 06ec42b Compare July 26, 2024 00:23
@rpmcginty rpmcginty requested a review from njmei July 26, 2024 00:24
Comment on lines +110 to +120
def get_duration_min_metric(
self,
name_override: Optional[str] = None,
) -> GraphMetricConfig:
name = name_override or self.lambda_function_name
return GraphMetricConfig(
metric="Duration",
statistic="Minimum",
dimension_map=self.dimension_map,
label=f"{name} Min",
)
Collaborator:

I wonder if we should start with a more minimal set of metrics (maybe just successes, failures, and durations?) and then add more only if we know we really need them. Metrics like minimum duration don't seem the most useful.

Collaborator Author:

These are the metrics you see in the Lambda monitoring dashboard, so I just replicated what is displayed there. This is already the case for the OCS graphs.

I think things like min/max over 5-minute windows help give more insight into whether there are outlier runs, but let's chat at standup.
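To illustrate the pattern being discussed, here is a minimal sketch of a companion max-duration config mirroring the `get_duration_min_metric` method in the diff. The `GraphMetricConfig` stand-in below uses only the field names visible in the diff; the dataclass definition itself and the `LambdaMetricsMixin` holder class are assumptions for the sake of a self-contained example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


# Minimal stand-in for the project's GraphMetricConfig; the field names
# come from the diff above, but this definition is assumed.
@dataclass
class GraphMetricConfig:
    metric: str
    statistic: str
    dimension_map: Dict[str, str]
    label: str


class LambdaMetricsMixin:
    """Hypothetical holder for the attributes the diffed method references."""

    def __init__(self, lambda_function_name: str, dimension_map: Dict[str, str]):
        self.lambda_function_name = lambda_function_name
        self.dimension_map = dimension_map

    def get_duration_max_metric(
        self,
        name_override: Optional[str] = None,
    ) -> GraphMetricConfig:
        # Mirrors get_duration_min_metric from the diff, swapping the
        # CloudWatch statistic to "Maximum" so outlier runs stand out
        # within each graph period.
        name = name_override or self.lambda_function_name
        return GraphMetricConfig(
            metric="Duration",
            statistic="Maximum",
            dimension_map=self.dimension_map,
            label=f"{name} Max",
        )


cfg = LambdaMetricsMixin("my-fn", {"FunctionName": "my-fn"}).get_duration_max_metric()
```

Plotting the paired Minimum/Maximum statistics on one graph is what makes outlier runs visible, since the Average statistic alone smooths them away.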

label=f"{name_override or self.state_machine_name} Started",
statistic="Sum",
dimension_map=self.dimension_map,
)
Collaborator:

Same question here: do we need to know the number of invocations if we are already logging completions/failures?

Collaborator Author:

This metric gives us a sense of which long-running jobs have started. I added this to OCS because, like analysis jobs, the alignment jobs take a long time to run, and I think seeing both the start and completion times is helpful.
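The truncated diff above suggests a "Started" metric for state machines. A plausible sketch of the full method is below. `ExecutionsStarted` is the CloudWatch metric Step Functions emits when an execution begins; the `GraphMetricConfig` stand-in and the `StateMachineMetricsMixin` class name are assumptions, reconstructed only from the fields visible in the diff.

```python
from dataclasses import dataclass
from typing import Dict, Optional


# Stand-in for the project's GraphMetricConfig (field names from the diff;
# the definition itself is assumed for illustration).
@dataclass
class GraphMetricConfig:
    metric: str
    statistic: str
    dimension_map: Dict[str, str]
    label: str


class StateMachineMetricsMixin:
    """Hypothetical container for the attributes the truncated diff references."""

    def __init__(self, state_machine_name: str, dimension_map: Dict[str, str]):
        self.state_machine_name = state_machine_name
        self.dimension_map = dimension_map

    def get_started_metric(
        self,
        name_override: Optional[str] = None,
    ) -> GraphMetricConfig:
        # "ExecutionsStarted" is emitted by Step Functions when an execution
        # begins; summed per graph period, it shows when long-running jobs
        # kicked off, complementing the completion/failure metrics.
        return GraphMetricConfig(
            metric="ExecutionsStarted",
            statistic="Sum",
            dimension_map=self.dimension_map,
            label=f"{name_override or self.state_machine_name} Started",
        )


cfg = StateMachineMetricsMixin(
    "alignment-sm", {"StateMachineArn": "arn:aws:states:..."}
).get_started_metric()
```

Pairing this with `ExecutionsSucceeded`/`ExecutionsFailed` on the same graph gives the start-to-completion view described in the comment above.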

@njmei njmei self-requested a review July 26, 2024 19:06
@rpmcginty rpmcginty merged commit b768105 into main Jul 26, 2024
4 checks passed
@rpmcginty rpmcginty deleted the feature/update-monitoring-constructs branch July 26, 2024 19:17