Move user-retirement scripts docs near code #34695

Merged
Changes from 3 commits
1 change: 1 addition & 0 deletions docs/conf.py
@@ -58,6 +58,7 @@
'sphinx.ext.autodoc',
'sphinx.ext.coverage',
'sphinx.ext.doctest',
'sphinx.ext.graphviz',
'sphinx.ext.ifconfig',
'sphinx.ext.intersphinx',
'sphinx.ext.mathjax',
6 changes: 3 additions & 3 deletions scripts/user_retirement/README.rst
@@ -1,7 +1,7 @@
User Retirement Scripts
=======================

`This <https://github.com/openedx/edx-platform/tree/master/scripts/user_retirement>`_ directory contains Python scripts that were migrated from the `tubular <https://github.com/openedx/tubular/tree/master/scripts>`_ repository.
These scripts are intended to drive the user retirement workflow, which involves handling the deactivation or removal of user accounts as part of the platform's management process.

These scripts could be called from any automation/CD framework.
@@ -49,9 +49,9 @@ In-depth Documentation and Configuration Steps

For in-depth documentation and essential configuration steps, see the following docs:

`Documentation <https://edx.readthedocs.io/projects/edx-installing-configuring-and-running/en/latest/configuration/user_retire/index.html>`_
`Documentation <https://docs.openedx.org/projects/edx-platform/en/latest/references/docs/scripts/user_retirement/docs/index.html>`_

`Configuration Docs <https://edx.readthedocs.io/projects/edx-installing-configuring-and-running/en/latest/configuration/user_retire/driver_setup.html>`_
`Configuration Docs <https://docs.openedx.org/projects/edx-platform/en/latest/references/docs/scripts/user_retirement/docs/driver_setup.html>`_


Execute Script
134 changes: 134 additions & 0 deletions scripts/user_retirement/docs/driver_setup.rst
@@ -0,0 +1,134 @@
.. _driver-setup:

#############################################
Setting Up the User Retirement Driver Scripts
#############################################

`scripts/user_retirement <https://github.com/openedx/edx-platform/tree/master/scripts/user_retirement>`_
is a directory of Python 3 scripts designed to plug into various automation
tooling. It also contains a README file that explains how to run the scripts.
Included in this directory are two scripts intended to drive the user
retirement workflow.

``get_learners_to_retire.py``
    Generates a list of users that are ready for immediate retirement. Users
    are "ready" after a certain number of days spent in the ``PENDING`` state,
    specified by the ``--cool_off_days`` argument. Produces an output intended
    for consumption by Jenkins in order to spawn separate downstream builds for
    each user.
``retire_one_learner.py``
    Retires the user specified by the ``--username`` argument.
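
The cool-off rule described above can be pictured with a small Python sketch. It is only an illustration of the idea, not the code the scripts or the LMS actually run: the function name ``is_ready_for_retirement``, its parameters, and the choice to treat exactly ``cool_off_days`` days as "ready" are all assumptions made for this example.

```python
from datetime import datetime, timedelta, timezone


def is_ready_for_retirement(pending_since, cool_off_days, now=None):
    """Return True once a learner has spent at least cool_off_days in PENDING.

    Hypothetical helper for illustration only; the real selection is done by
    get_learners_to_retire.py and the LMS.
    """
    now = now or datetime.now(timezone.utc)
    return now - pending_since >= timedelta(days=cool_off_days)


# A learner who entered PENDING six days ago passes a five-day cool-off.
reference = datetime(2024, 5, 7, tzinfo=timezone.utc)
print(is_ready_for_retirement(reference - timedelta(days=6), 5, now=reference))
```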

These two scripts share a required ``--config_file`` argument, which specifies
the driver configuration file for your environment (for example, production).
This configuration file is a YAML file that contains LMS auth secrets, API URLs,
and retirement pipeline stages specific to that environment. Here is an example
of a driver configuration file.

.. code-block:: yaml

   client_id: <client ID for the retirement service user>
   client_secret: <client secret for the retirement service user>

   base_urls:
       lms: https://courses.example.com/
       ecommerce: https://ecommerce.example.com/
       credentials: https://credentials.example.com/

   retirement_pipeline:
       - ['RETIRING_EMAIL_LISTS', 'EMAIL_LISTS_COMPLETE', 'LMS', 'retirement_retire_mailings']
       - ['RETIRING_ENROLLMENTS', 'ENROLLMENTS_COMPLETE', 'LMS', 'retirement_unenroll']
       - ['RETIRING_LMS_MISC', 'LMS_MISC_COMPLETE', 'LMS', 'retirement_lms_retire_misc']
       - ['RETIRING_LMS', 'LMS_COMPLETE', 'LMS', 'retirement_lms_retire']

The ``client_id`` and ``client_secret`` keys contain the OAuth credentials.
These credentials are copied from the output of the
``create_dot_application`` management command described in
:ref:`retirement-service-user`.

The ``base_urls`` section in the configuration file defines the mappings of
IDA to base URLs used by the scripts to construct API URLs. Only the LMS is
mandatory here, but if any of your pipeline states contain API calls to other
services, those services must also be present in the ``base_urls`` section.

The ``retirement_pipeline`` section defines the steps, state names, and order
of execution for each environment. Each item is a list in the form of:

#. Start state name
#. End state name
#. IDA to call against (LMS, ECOMMERCE, or CREDENTIALS currently)
#. Method name to call in
   `edx_api.py <https://github.com/openedx/edx-platform/blob/master/scripts/user_retirement/utils/edx_api.py>`_

For example: ``['RETIRING_CREDENTIALS', 'CREDENTIALS_COMPLETE', 'CREDENTIALS',
'retire_learner']`` will set the user's state to ``RETIRING_CREDENTIALS``, call
a pre-instantiated ``retire_learner`` method in the ``CredentialsApi``, then set
the user's state to ``CREDENTIALS_COMPLETE``.
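
The rules above (four items per pipeline entry, a mandatory LMS base URL, and a base URL for every service a stage calls) can be checked before a run. The following Python sketch is one possible way to do that on an already-parsed configuration. The helper ``find_config_problems`` is hypothetical and not part of the driver scripts, and matching an upper-case service name such as ``LMS`` to the lower-case ``base_urls`` key ``lms`` is an assumption based on the example configuration above.

```python
REQUIRED_KEYS = ("client_id", "client_secret", "base_urls", "retirement_pipeline")


def find_config_problems(config):
    """Return a list of human-readable problems found in a parsed driver config.

    Hypothetical helper for illustration; the real scripts perform their own checks.
    """
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in config]
    base_urls = config.get("base_urls", {})
    if "lms" not in base_urls:
        problems.append("base_urls must define 'lms'")
    for index, stage in enumerate(config.get("retirement_pipeline", [])):
        if len(stage) != 4:
            problems.append(f"stage {index}: expected 4 items, got {len(stage)}")
            continue
        _start_state, _end_state, service, _method = stage
        # Assumption: stages name the IDA in upper case, base_urls keys are lower case.
        if service.lower() not in base_urls:
            problems.append(f"stage {index}: no base URL for {service}")
    return problems


example = {
    "client_id": "example-id",
    "client_secret": "example-secret",
    "base_urls": {"lms": "https://courses.example.com/"},
    "retirement_pipeline": [
        ["RETIRING_LMS", "LMS_COMPLETE", "LMS", "retirement_lms_retire"],
        ["RETIRING_CREDENTIALS", "CREDENTIALS_COMPLETE", "CREDENTIALS", "retire_learner"],
    ],
}
print(find_config_problems(example))
```

In this example the second stage calls the credentials service, but ``base_urls`` only defines ``lms``, so the sketch reports one problem for stage 1.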

********
Examples
********

The following are some examples of how to use the driver scripts.

==================
Set Up Environment
==================

Follow this `readme <https://github.com/openedx/edx-platform/tree/master/scripts/user_retirement#readme>`_ to set up your execution environment.

=========================
List of Targeted Learners
=========================

Generate a list of learners that are ready for retirement (those learners who
have selected and confirmed account deletion and have been in the ``PENDING``
state for the number of days specified by ``--cool_off_days``).

.. code-block:: bash

   mkdir learners_to_retire
   get_learners_to_retire.py \
       --config_file=path/to/config.yml \
       --output_dir=learners_to_retire \
       --cool_off_days=5

=====================
Run Retirement Script
=====================

After running these commands, the ``learners_to_retire`` directory contains
several INI files, each containing a single line in the form of
``USERNAME=<username-of-learner>``. Iterate over these files while executing the
``retire_one_learner.py`` script on each learner with a command like the following.

.. code-block:: bash

   retire_one_learner.py \
       --config_file=path/to/config.yml \
       --username=<username-of-learner-to-retire>
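
If no automation framework is available, the iteration described above could be scripted directly. The following Python sketch is a minimal illustration under a few assumptions: each file in the output directory holds one ``USERNAME=<username>`` line, ``retire_one_learner.py`` is on the ``PATH`` as in the command above, and the helper names (``read_usernames``, ``build_command``, ``retire_all``) are invented for this example.

```python
import subprocess
from pathlib import Path


def read_usernames(output_dir):
    """Collect usernames from files containing a single USERNAME=<name> line."""
    usernames = []
    for path in sorted(Path(output_dir).iterdir()):
        if not path.is_file():
            continue
        for line in path.read_text().splitlines():
            key, sep, value = line.partition("=")
            if sep and key.strip() == "USERNAME" and value.strip():
                usernames.append(value.strip())
    return usernames


def build_command(username, config_file):
    """Build the argument list for one retire_one_learner.py invocation."""
    return [
        "retire_one_learner.py",
        f"--config_file={config_file}",
        f"--username={username}",
    ]


def retire_all(output_dir, config_file, dry_run=True):
    """Run (or, by default, only print) one retirement command per learner."""
    for username in read_usernames(output_dir):
        command = build_command(username, config_file)
        if dry_run:
            print(" ".join(command))
        else:
            # check=False so that one failed learner does not stop the loop.
            subprocess.run(command, check=False)
```

Calling ``retire_all("learners_to_retire", "path/to/config.yml")`` only prints the commands; passing ``dry_run=False`` would execute them one learner at a time.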


**************************************************
Using the Driver Scripts in an Automated Framework
**************************************************

At edX, we call the user retirement scripts from
`Jenkins <https://jenkins.io/>`_ jobs on one of our internal Jenkins
services. The user retirement driver scripts are intended to be agnostic
about which automation framework you use, but they were only fully tested
from Jenkins.

Review comment (Contributor), on the sentence beginning "At edX, we call the user retirement scripts from":
This document is for openedx so I think this needs to be reworded, something like "An example of how these scripts can be run from automation frameworks like Jenkins can be found ..."

Reply (Contributor Author, @farhan, May 7, 2024):
@feanil
These docs are not accessible to the open-source community.
Shouldn't we delete them all the way or mention clearly that it's for internal use of 2U only?

For more information about how we execute these scripts at edX, see the
following wiki articles:

* `User Retirement Jenkins Implementation <https://openedx.atlassian.net/wiki/spaces/PLAT/pages/704872737/User+Retirement+Jenkins+Implementation>`_
* `How to: retirement Jenkins jobs development and testing <https://openedx.atlassian.net/wiki/spaces/PLAT/pages/698221444/How+to+retirement+Jenkins+jobs+development+and+testing>`_

And check out the Groovy DSL files we use to seed these jobs:

* `platform/jobs/RetirementJobs.groovy in edx/jenkins-job-dsl <https://github.com/edx/jenkins-job-dsl/blob/master/platform/jobs/RetirementJobs.groovy>`_
* `platform/jobs/RetirementJobEdxTriggers.groovy in edx/jenkins-job-dsl <https://github.com/edx/jenkins-job-dsl/blob/master/platform/jobs/RetirementJobEdxTriggers.groovy>`_

.. include:: ../../../../links/links.rst

117 changes: 117 additions & 0 deletions scripts/user_retirement/docs/implementation_overview.rst
@@ -0,0 +1,117 @@
.. _Implmentation:

#######################
Implementation Overview
#######################

In the Open edX platform, the user experience is enabled by several
services, such as LMS, Studio, ecommerce, credentials, discovery, and more.
Personally Identifiable Information (PII) about a user can exist in many of
these services. As a consequence, to remove a user's PII, you must be able
to request each service containing PII to remove, delete, or unlink the
data for that user in that service.

In the user retirement feature, a centralized process (the *driver* scripts)
orchestrates all of these requests. For information about how to configure the
driver scripts, see :ref:`driver-setup`.

****************************
The User Retirement Workflow
****************************

The user retirement workflow is a configurable pipeline of building-block
APIs. These APIs are used to:

* "Forget" a retired user's PII
* Prevent a retired user from logging back in
* Prevent re-use of the username or email address of a retired user

Depending on which third parties a given Open edX instance integrates with,
the user retirement process may need to call out to external services or to
generate reports for later processing. Any such reports must subsequently be
destroyed.

Configurability and adaptability were design goals from the beginning, so this
user retirement tooling should be able to accommodate a wide range of Open edX
sites and custom use cases.

The workflow is designed to be linear and rerunnable, allowing recovery and
continuation in cases where a particular stage fails. Each user who has
requested retirement will be individually processed through this workflow, so
multiple users could be in the same state simultaneously. The LMS is the
authoritative source of information about the state of each user in the
retirement process, and the arbiter of state progressions, using the
``UserRetirementStatus`` model and associated APIs. The LMS also holds a
table of the states themselves (the ``RetirementState`` model), rather than
hard-coding the states. This was done because we cannot predict all the
possible states required by all members of the Open edX community.

This example state diagram outlines the pathways users follow throughout the
workflow:

.. digraph:: retirement_states_example
   :align: center

   ranksep = "0.3";

   node[fontname=Courier,fontsize=12,shape=box,group=main]
   { rank = same INIT[style=invis] PENDING }
   INIT -> PENDING;
   "..."[shape=none]
   PENDING -> RETIRING_ENROLLMENTS -> ENROLLMENTS_COMPLETE -> RETIRING_FORUMS -> FORUMS_COMPLETE -> "..." -> COMPLETE;

   node[group=""];
   RETIRING_ENROLLMENTS -> ERRORED;
   RETIRING_FORUMS -> ERRORED;
   PENDING -> ABORTED;

   subgraph cluster_terminal_states {
       label = "Terminal States";
       labelloc = b  // put label at bottom
       {rank = same ERRORED COMPLETE ABORTED}
   }

Unless an error occurs internal to the user retirement tooling, a user's
retirement state should always land in one of the terminal states. At that
point, either their entry should be cleaned up from the
``UserRetirementStatus`` table or, if the state is ``ERRORED``, the
administrator needs to examine the error and resolve it. For more information,
see :ref:`recovering-from-errored`.
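
To make the "linear and rerunnable" property concrete, the following Python sketch computes which pipeline stages are still outstanding for a learner. It is an illustration only and does not reproduce the driver's actual logic. The function ``remaining_stages`` is hypothetical, and it assumes that the only valid resume points are ``PENDING`` or the end state of a completed stage; how the real scripts treat other states is not covered here.

```python
def remaining_stages(pipeline, current_state):
    """Return the pipeline stages still to run for a learner in current_state.

    Hypothetical helper; assumes a learner can only resume from PENDING or
    from the end state of a stage that already completed.
    """
    if current_state == "PENDING":
        return list(pipeline)
    for index, (_start_state, end_state, _service, _method) in enumerate(pipeline):
        if current_state == end_state:
            # Everything up to and including this stage is done.
            return list(pipeline[index + 1:])
    raise ValueError(f"cannot resume from state {current_state!r}")


pipeline = [
    ["RETIRING_ENROLLMENTS", "ENROLLMENTS_COMPLETE", "LMS", "retirement_unenroll"],
    ["RETIRING_LMS", "LMS_COMPLETE", "LMS", "retirement_lms_retire"],
]
# A learner whose enrollments stage finished only needs the LMS stage.
print(remaining_stages(pipeline, "ENROLLMENTS_COMPLETE"))
```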

*******************
The User Experience
*******************

From the learner's perspective, the vast majority of this process is obscured.
The Account page contains a new section titled **Delete My Account**. In this
section, a learner may click the **Delete My Account** button and enter
their password to confirm their request. Subsequently, all of the learner's
browser sessions are logged off, and they become locked out of their account.

An informational email is immediately sent to the learner to confirm the
deletion of their account. After this email is sent, the learner has a limited
amount of time (defined by the ``--cool_off_days`` argument described in
:ref:`driver-setup`) to contact the site administrators and rescind their
request.

At this point, the learner's account has been deactivated, but *not* retired.
An entry in the ``UserRetirementStatus`` table is added, and their state is set
to ``PENDING``.

By default, the **Delete My Account** section is visible and the button is
enabled, allowing account deletions to queue up. The
``ENABLE_ACCOUNT_DELETION`` feature in the Django settings toggles the
visibility of this section. See :ref:`django-settings`.

================
Third Party Auth
================

Learners who registered using social authentication must first unlink their
LMS account from their third-party account. For those learners, the **Delete
My Account** button will be disabled until they do so; meanwhile, they will be
instructed to follow the procedure in this help center article: `How do I link
or unlink my edX account to a social media
account? <https://support.edx.org/hc/en-us/articles/207206067>`_.

.. include:: ../../../../links/links.rst
38 changes: 38 additions & 0 deletions scripts/user_retirement/docs/index.rst
@@ -0,0 +1,38 @@
.. _Enabling User Retirement:

####################################
Enabling the User Retirement Feature
####################################

There have been many changes to privacy laws (for example, the European Union
General Data Protection Regulation, or GDPR) intended to change the way
that businesses think about and handle Personally Identifiable Information
(PII).

As a step toward enabling Open edX to support some of the key updates in privacy
laws, edX has implemented APIs and tooling that enable Open edX instances to
retire registered users. When you implement this user retirement feature, your
Open edX instance can automatically erase PII for a given user from systems that
are internal to Open edX (for example, the LMS, forums, credentials, and other
independently deployable applications (IDAs)), as well as external systems, such
as third-party marketing services.

This section is intended not only for instructing Open edX admins to perform
the basic setup, but also to offer some insight into the implementation of the
user retirement feature in order to help the Open edX community build
additional APIs and states that meet their special needs. Custom code,
plugins, packages, or XBlocks in your Open edX instance might store PII, but
this feature will not magically find and clean up that PII. You may need to
create your own custom code to include PII that is not covered by the user
retirement feature.

.. toctree::
   :maxdepth: 1

   implementation_overview
   service_setup
   driver_setup
   special_cases

.. include:: ../../../../links/links.rst
