Skip to content

Commit

Permalink
feat: phase 2 of code owner monitoring rollout
Browse files Browse the repository at this point in the history
Completes code owner monitoring updates, which drops owner
theme and finalizes the code owner span tags. See doc and
ADR updates for more details.

* The code_owner_theme_2 tag was dropped altogether.
* The temporary suffix (_2) was removed from other span
  tags.
* The code_owner (formerly code_owner_2) tag no longer
  includes the theme name.
* The new name for the django setting is
  CODE_OWNER_TO_PATH_MAPPINGS (formerly
  CODE_OWNER_MAPPINGS).
* The django setting CODE_OWNER_THEMES was dropped.
* Updates the generate_code_owner_mappings.py script
  accordingly.

Implements:
- #784
  • Loading branch information
robrap committed Dec 10, 2024
1 parent 4f0e438 commit a96686d
Show file tree
Hide file tree
Showing 15 changed files with 280 additions and 364 deletions.
13 changes: 13 additions & 0 deletions CHANGELOG.rst
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,19 @@ Change Log
Unreleased
~~~~~~~~~~

[6.1.0] - 2024-12-10
~~~~~~~~~~~~~~~~~~~~
Changed
_______
* Completes code owner monitoring updates, which drops owner theme and finalizes the code owner span tags. See doc and ADR updates for more details.

* The code_owner_theme_2 tag was dropped altogether.
* The temporary suffix (_2) was removed from other span tags.
* The code_owner (formerly code_owner_2) tag no longer includes the theme name.
* The new name for the django setting is CODE_OWNER_TO_PATH_MAPPINGS (formerly CODE_OWNER_MAPPINGS).
* The django setting CODE_OWNER_THEMES was dropped.
* Updates the generate_code_owner_mappings.py script accordingly.

[6.0.0] - 2024-12-05
~~~~~~~~~~~~~~~~~~~~
Removed
Expand Down
2 changes: 1 addition & 1 deletion edx_arch_experiments/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,4 +2,4 @@
A plugin to include applications under development by the architecture team at 2U.
"""

__version__ = '6.0.0'
__version__ = '6.1.0'
4 changes: 2 additions & 2 deletions edx_arch_experiments/datadog_monitoring/README.rst
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
Datadog Monitoring
###################

When installed in the LMS as a plugin app, the ``datadog_monitoring`` app adds additional monitoring.
When installed in the LMS as a plugin app, the ``datadog_monitoring`` app adds additional 2U-specific monitoring.

This is where our code_owner_2 monitoring code lives, for example.
This is where our code owner monitoring code lives, for example.
4 changes: 2 additions & 2 deletions edx_arch_experiments/datadog_monitoring/code_owner/datadog.py
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
"""
Datadog span processor for celery span code owners.
"""
from .utils import set_code_owner_attribute_from_module
from .utils import set_code_owner_span_tags_from_module


class CeleryCodeOwnerSpanProcessor:
Expand All @@ -17,7 +17,7 @@ def on_span_start(self, span):
# We can use this for celery spans, because the resource name is more predictable
# and available from the start. For django requests, we'll instead continue to use
# django middleware for setting code owner.
set_code_owner_attribute_from_module(span.resource)
set_code_owner_span_tags_from_module(span.resource)

def on_span_finish(self, span):
pass
Expand Down
183 changes: 42 additions & 141 deletions edx_arch_experiments/datadog_monitoring/code_owner/utils.py
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
"""
Utilities for monitoring code_owner_2
Utilities for monitoring code_owner
"""
import logging

Expand All @@ -19,9 +19,6 @@ def get_code_owner_from_module(module):
this lookup would match on 'openedx.features.discounts' before
'openedx.features', because the former is more specific.
See how to:
https://github.com/openedx/edx-django-utils/blob/master/edx_django_utils/monitoring/docs/how_tos/add_code_owner_custom_attribute_to_an_ida.rst
"""
if not module:
return None
Expand Down Expand Up @@ -54,7 +51,7 @@ def is_code_owner_mappings_configured():

def get_code_owner_mappings():
"""
Returns the contents of the CODE_OWNER_MAPPINGS Django Setting, processed
Returns the contents of the CODE_OWNER_TO_PATH_MAPPINGS Django Setting, processed
for efficient lookup by path.
Returns:
Expand All @@ -81,12 +78,12 @@ def get_code_owner_mappings():
# processed map. Worst case, it is processed more than once at start-up.
path_to_code_owner_mapping = {}

# .. setting_name: CODE_OWNER_MAPPINGS
# .. setting_name: CODE_OWNER_TO_PATH_MAPPINGS
# .. setting_default: None
# .. setting_description: Used for monitoring and reporting of ownership. Use a
# dict with keys of code owner name and value as a list of dotted path
# module names owned by the code owner.
code_owner_mappings = getattr(settings, 'CODE_OWNER_MAPPINGS', None)
code_owner_mappings = getattr(settings, 'CODE_OWNER_TO_PATH_MAPPINGS', None)
if code_owner_mappings is None:
return None

Expand All @@ -97,72 +94,73 @@ def get_code_owner_mappings():
path_to_code_owner_mapping[path] = code_owner
except TypeError as e:
log.exception(
'Error processing CODE_OWNER_MAPPINGS. {}'.format(e) # pylint: disable=logging-format-interpolation
'Error processing CODE_OWNER_TO_PATH_MAPPINGS. {}'.format(e) # pylint: disable=logging-format-interpolation
)
raise e

_PATH_TO_CODE_OWNER_MAPPINGS = path_to_code_owner_mapping
return _PATH_TO_CODE_OWNER_MAPPINGS


def set_code_owner_attribute_from_module(module):
def set_code_owner_span_tags_from_module(module):
"""
Updates the code_owner_2 and code_owner_2_module custom attributes.
Celery tasks or other non-web functions do not use middleware, so we need
an alternative way to set the code_owner_2 custom attribute.
Note: These settings will be overridden by the CodeOwnerMonitoringMiddleware.
This method can't be used to override web functions at this time.
Updates the code_owner and code_owner_module custom span tags.
Usage::
set_code_owner_2_attribute_from_module(__name__)
set_code_owner_span_tags_from_module(__name__)
"""
set_custom_attribute('code_owner_2_module', module)
# .. custom_attribute_name: code_owner_module
# .. custom_attribute_description: The module used to determine the code_owner. This can
# be useful for debugging issues for missing code owner span tags.
set_custom_attribute('code_owner_module', module)
code_owner = get_code_owner_from_module(module)

if code_owner:
set_code_owner_custom_attributes(code_owner)
set_code_owner_custom_span_tags(code_owner)


def set_code_owner_custom_attributes(code_owner):
def set_code_owner_custom_span_tags(code_owner):
"""
Sets custom metrics for code_owner_2, code_owner_2_theme, and code_owner_2_squad
Sets custom span tags for code_owner, and code_owner_squad
"""
if not code_owner: # pragma: no cover
return
set_custom_attribute('code_owner_2', code_owner)
theme = _get_theme_from_code_owner(code_owner)
if theme:
set_custom_attribute('code_owner_2_theme', theme)
squad = _get_squad_from_code_owner(code_owner)
if squad:
set_custom_attribute('code_owner_2_squad', squad)

# .. custom_attribute_name: code_owner
# .. custom_attribute_description: The squad owner name for the tagged span.
set_custom_attribute('code_owner', code_owner)
# .. custom_attribute_name: code_owner_squad
# .. custom_attribute_description: Deprecated code_owner_squad is now redundant
# to the code_owner span tag.
set_custom_attribute('code_owner_squad', code_owner)

# .. custom_attribute_name: code_owner_plugin
# .. custom_attribute_description: This is a temporary span tag to roll out the
# the switch from edx-django-utils to this plugin. If this span tag is True,
# the plugin has added the above custom span tags (possibly in addition to
# edx-django-utils). If the code_owner_theme span tag is also seen, then
# edx-django-utils is also adding these span tags. If not, only the plugin is
# creating these tags.
set_custom_attribute('code_owner_plugin', True)

def set_code_owner_attribute(request):

def set_code_owner_span_tags_from_request(request):
"""
Sets the code_owner_2 custom attribute for the request.
Sets the code_owner custom span tag for the request.
"""
code_owner = None
module = _get_module_from_request(request)
if module:
code_owner = get_code_owner_from_module(module)

if code_owner:
set_code_owner_custom_attributes(code_owner)
set_code_owner_span_tags_from_module(module)


def _get_module_from_request(request):
"""
Get the module from the request path or the current transaction.
Side-effects:
Sets code_owner_2_module custom attribute, used to determine code_owner_2.
If module was not found, may set code_owner_2_path_error custom attribute
if applicable.
Sets code_owner_path_error custom span tag if applicable.
Returns:
str: module name or None if not found
Expand All @@ -172,14 +170,14 @@ def _get_module_from_request(request):
return None

module, path_error = _get_module_from_request_path(request)
if module:
set_custom_attribute('code_owner_2_module', module)
return module

# monitor errors if module was not found
if path_error:
set_custom_attribute('code_owner_2_path_error', path_error)
return None
# .. custom_attribute_name: code_owner_path_error
# .. custom_attribute_description: Error details if the module can't be found. This can
# be useful for debugging issues for missing code owner span tags.
set_custom_attribute('code_owner_path_error', path_error)

return module


def _get_module_from_request_path(request):
Expand All @@ -204,100 +202,3 @@ def clear_cached_mappings():
"""
global _PATH_TO_CODE_OWNER_MAPPINGS
_PATH_TO_CODE_OWNER_MAPPINGS = None
global _CODE_OWNER_TO_THEME_AND_SQUAD_MAPPINGS
_CODE_OWNER_TO_THEME_AND_SQUAD_MAPPINGS = None


# Cached lookup table for code owner theme and squad given a code owner.
# - Although code owner is "theme-squad", a hyphen may also be in the theme or squad name, so this ensures we get both
# correctly from config.
# Do not access this directly, but instead use get_code_owner_theme_squad_mappings.
_CODE_OWNER_TO_THEME_AND_SQUAD_MAPPINGS = None


def get_code_owner_theme_squad_mappings():
"""
Returns the contents of the CODE_OWNER_THEMES Django Setting, processed
for efficient lookup by path.
Returns:
(dict): dict mapping code owners to a dict containing the squad and theme, or
an empty dict if there are no configured mappings.
Example return value::
{
'theme-x-team-red': {
'theme': 'theme-x',
'squad': 'team-red',
},
'theme-x-team-blue': {
'theme': 'theme-x',
'squad': 'team-blue',
},
}
"""
global _CODE_OWNER_TO_THEME_AND_SQUAD_MAPPINGS

# Return cached processed mappings if already processed
if _CODE_OWNER_TO_THEME_AND_SQUAD_MAPPINGS is not None:
return _CODE_OWNER_TO_THEME_AND_SQUAD_MAPPINGS

# Uses temporary variable to build mappings to avoid multi-threading issue with a partially
# processed map. Worst case, it is processed more than once at start-up.
code_owner_to_theme_and_squad_mapping = {}

# .. setting_name: CODE_OWNER_THEMES
# .. setting_default: None
# .. setting_description: Used for monitoring and reporting of ownership. Use a
# dict with keys of code owner themes and values as a list of code owner names
# including theme and squad, separated with a hyphen.
code_owner_themes = getattr(settings, 'CODE_OWNER_THEMES', {})

try:
for theme in code_owner_themes:
code_owner_list = code_owner_themes[theme]
for code_owner in code_owner_list:
squad = code_owner.split(theme + '-', 1)[1]
code_owner_details = {
'theme': theme,
'squad': squad,
}
code_owner_to_theme_and_squad_mapping[code_owner] = code_owner_details
except TypeError as e:
log.exception(
'Error processing CODE_OWNER_THEMES setting. {}'.format(e) # pylint: disable=logging-format-interpolation
)
raise e

_CODE_OWNER_TO_THEME_AND_SQUAD_MAPPINGS = code_owner_to_theme_and_squad_mapping
return _CODE_OWNER_TO_THEME_AND_SQUAD_MAPPINGS


def _get_theme_from_code_owner(code_owner):
"""
Returns theme for a code_owner (e.g. 'theme-my-squad' => 'theme')
"""
mappings = get_code_owner_theme_squad_mappings()
if mappings is None: # pragma: no cover
return None

if code_owner in mappings:
return mappings[code_owner]['theme']

return None


def _get_squad_from_code_owner(code_owner):
"""
Returns squad for a code_owner (e.g. 'theme-my-squad' => 'my-squad')
"""
mappings = get_code_owner_theme_squad_mappings()
if mappings is None: # pragma: no cover
return None

if code_owner in mappings:
return mappings[code_owner]['squad']

return None
Original file line number Diff line number Diff line change
@@ -0,0 +1,54 @@
Monitoring by Code Owner
************************

Status
======

Accepted

Context
=======

It is difficult for different teams to have team-based on-calls rotations, alerting and monitoring for various parts of the edx-platform (specifically LMS).

* The original decision to add code_owner custom span tags (custom attributes) was documented in `edx-platform in 0001-monitoring-by-code-owner`_.
* The decision to move the code for reuse across IDAs was documented in `edx-django-utils in 0001-monitoring-by-code-owner.rst`_.
* The decision for how to implement code owner details for celery tasks was documented in `0003-code-owner-for-celery-tasks_, and was limited by New Relic's instrumentation.
* The decision to break up the ``code_owner`` custom span tag (custom attribute) into ``code_owner_squad`` and ``code_owner_theme`` tags was documented in `0004-code-owner-theme-and-squad`_.

Some changes or clarifications since this time:

* It turns out that this functionality is only really useful for the edx-platform LMS. Our other services (IDAs) are small enough to keep to a single owner, or to solve monitoring issues in other ways.
* It is likely that the ``code_owner`` code is only really needed by 2U.
* 2U wants to drop owner themes from its code_owner custom span tag.
* 2U has switched to Datadog, which has slightly different capabilities from New Relic.

* Note that Datadog has custom span tags, where New Relic has custom attributes to refer to its tagging capabilities.

.. _edx-platform in 0001-monitoring-by-code-owner: https://github.com/openedx/edx-platform/blob/f29e418264f374099930a5b1f5b8345c569892e9/lms/djangoapps/monitoring/docs/decisions/0001-monitoring-by-code-owner.rst
.. _edx-django-utils in 0001-monitoring-by-code-owner.rst: https://github.com/openedx/edx-django-utils/blob/a1a1ec95d7c1d4767deb578748153c99c9562a04/edx_django_utils/monitoring/docs/decisions/0001-monitoring-by-code-owner.rst
.. _0003-code-owner-for-celery-tasks: https://github.com/openedx/edx-django-utils/blob/a1a1ec95d7c1d4767deb578748153c99c9562a04/edx_django_utils/monitoring/docs/decisions/0003-code-owner-for-celery-tasks.rst
.. _0004-code-owner-theme-and-squad: https://github.com/openedx/edx-django-utils/blob/a1a1ec95d7c1d4767deb578748153c99c9562a04/edx_django_utils/monitoring/docs/decisions/0004-code-owner-theme-and-squad.rst

Decision
========

2U has moved its code owner monitoring implementation to the datadog_monitoring plugin.

* The owner theme name has been dropped from the ``code_owner`` custom span tag value in this new implementation.
* The ``code_owner_theme`` span tag has been dropped altogether from this new implementation.
* The now deprecated ``code_owner_squad`` span tag, which is redundant with the updated ``code_owner`` tag, will continue to be supported for backward compatibility.
* A Datadog span processor was used to add the code owner span tags for celery tasks, so there is no longer a need for a special decorator on each celery task.
* A new capability added to edx-django-utils to add `monitoring signals for plugins`_ is used to monitor Django requests.

Also, a new search script was implemented in this repository: `search_datadog.rst`_.

.. _monitoring signals for plugins: https://github.com/openedx/edx-django-utils/pull/467
.. _search_datadog.rst: https://github.com/edx/edx-arch-experiments/blob/main/edx_arch_experiments/datadog_monitoring/scripts/datadog_search.py

Consequences
============

- In addition to having greater flexibility to update these custom tags as-needed for 2U, we can also DEPR the code owner functionality in the Open edX codebase, where it is not likely to be needed.
- Spreadsheet changes will no longer be required when a squad moves from one part of the organization to another (e.g. changes themes).
- However, without including themes, it may take additional time to learn about a squad's place in the organization when seen in the code_owner span tag. For example, it will not be as immediately clear when dealing with an enterprise squad, unless you are familiar with all of the squad names.
Loading

0 comments on commit a96686d

Please sign in to comment.