Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add scope user_id to JWT payload #33455

Merged
merged 14 commits into from
Oct 30, 2023
Original file line number Diff line number Diff line change
@@ -0,0 +1,36 @@
15. Add scope user_id for JWT token
###################################

Status
------

Accepted

Context
-------

In Feb 2018, to enable analytics (Segment) from Microfrontends (MFEs), a ``user_id`` claim was added to the JWT token `in this PR<https://github.com/openedx/edx-platform/pull/19765>`__.

The LMS API `to create authentication tokens`_ is used by external organizations to request a token on behalf of their users, mostly using grant_type ``client_credentials`` in the request. Since ``user_id`` is considered sensitive information, especially when combined with email and username which were already available in the JWT, it was decided to only add the ``user_id`` claim when a ``user_id`` scope was supplied. All MFE JWT cookies, which are known to only be used directly by the user, automatically used the ``user_id`` scope in order to get the required ``user_id`` claim.

No ADR could be found for the Feb 2018 decision detailed above.

In June 2019, an `ADR was captured in ecommerce`_ around the requirements to have the LMS user_id available for requests to ecommerce.

In 2022, the mobile apps switched to using JWTs for authentication. However, these JWTs were missing the ``user_id`` scope and claim required by the ecommerce service.

.. _to create authentication tokens: https://github.com/openedx/edx-platform/blob/caf8e456e28f9b9a1f5fa7186d3d155112fb75be/openedx/core/djangoapps/oauth_dispatch/urls.py#L14
.. _ADR was captured in ecommerce: https://github.com/openedx/ecommerce/blob/master/docs/decisions/0004-unique-identifier-for-users.rst

Decisions
---------

- The original decision to add the ``user_id`` claim to the JWT token using the ``user_id`` scope has been captured in the context of this ADR, because no ADR could be found.
- The scope ``user_id`` will be added to all requests having grant_type ``password`` in the API `/oauth2/access_token/`.

Consequences
------------

- The claim ``user_id`` will be present in the JWT token for all requesters who already have access to the login credentials of the user account.
- The ``user_id`` scope will continue to protect other JWT requests that don't require this sensitive information.
- This pattern could potentially be used to clean-up the manually added ``user_id`` scope for oauth clients involved in the social auth flow in the future.
10 changes: 9 additions & 1 deletion openedx/core/djangoapps/oauth_dispatch/jwt.py
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@
from edx_toggles.toggles import SettingToggle
from jwt import PyJWK
from jwt.utils import base64url_encode
from oauth2_provider.models import Application

from common.djangoapps.student.models import UserProfile, anonymous_id_for_user

Expand Down Expand Up @@ -170,7 +171,8 @@ def _create_jwt(
# Default scopes should only contain non-privileged data.
# Do not be misled by the fact that `email` and `profile` are default scopes. They
# were included for legacy compatibility, even though they contain privileged data.
scopes = scopes or ['email', 'profile']
# The scope `user_id` must be added for requests with grant_type password.
scopes = _update_user_id_in_scopes(scopes or ['email', 'profile'], grant_type)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, looking more closely add the code, I think we should follow the same pattern of either using the explicitly defined scopes, or using the default scopes. Something like the following:

    # Default scopes should only contain non-privileged data.
    # Do not be misled by the fact that `email` and `profile` are default scopes. They
    # were included for legacy compatibility, even though they contain privileged data.
    if grant_type == Application.GRANT_PASSWORD:
        default_scopes = ['user_id', 'email', 'profile']
    else
        default_scopes = ['email', 'profile']
    scopes = scopes or default_scopes

All of this could be packaged into a private method if you wish, but it is more consistent (i.e. won't override explicitly set scopes) and keeps all the scopes code together. Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JWT consumes scopes from the access_token payload. By default, scopes will almost never be empty since the default scopes are always added to the access_token (in case of missing explicitly defined scopes).

OAUTH2_DEFAULT_SCOPES = {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. [Nit] Thanks for the clarification. This is just an opinion, but I'd prefer seeing _update_user_id_in_scopes renamed to _get_updated_scopes (or some name like that), and packaging all the scope related code (default + user_id) in one place. If you feel strongly otherwise, no need to update.
  2. I can't think of a clean and simply way to create a new appropriate setting related to user_id, but wondering if we could at least add a comment before
    'user_id': _('Know your user identifier'),
    that reads something like:
# user_id is added in code as a default scope for JWT cookies and all password grant_type JWTs

iat, exp = _compute_time_fields(expires_in)

payload = {
Expand Down Expand Up @@ -285,3 +287,9 @@ def _encode_and_sign(payload, use_asymmetric_key, secret):

jwk = PyJWK(key, algorithm)
return jwt.encode(payload, jwk.key, algorithm=algorithm)


def _update_user_id_in_scopes(scopes, grant_type):
if grant_type == Application.GRANT_PASSWORD and 'user_id' not in scopes:
scopes.append('user_id')
return scopes
3 changes: 2 additions & 1 deletion openedx/core/djangoapps/oauth_dispatch/tests/test_jwt.py
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,7 @@ def setUp(self):
super().setUp()
self.user = UserFactory()
self.default_scopes = ['email', 'profile']
self.default_scopes_password_grant_type = ['email', 'profile', 'user_id']

def _create_client(self, oauth_adapter, client_restricted, grant_type=None):
"""
Expand Down Expand Up @@ -176,7 +177,7 @@ def test_password_grant_type(self):
jwt_token_dict = jwt_api.create_jwt_token_dict(token_dict, oauth_adapter, use_asymmetric_key=False)

self.assert_valid_jwt_access_token(
jwt_token_dict["access_token"], self.user, self.default_scopes,
jwt_token_dict["access_token"], self.user, self.default_scopes_password_grant_type,
grant_type='password',
)

Expand Down
39 changes: 30 additions & 9 deletions openedx/core/djangoapps/oauth_dispatch/tests/test_views.py
Original file line number Diff line number Diff line change
Expand Up @@ -174,16 +174,22 @@ def _test_jwt_access_token(self, client_attr, token_type=None, headers=None, gra
expected_default_expires_in = 60 * 60
assert data['expires_in'] == expected_default_expires_in
assert data['token_type'] == 'JWT'
expected_scopes = data['scope'].split(' ')
self._update_expected_scopes_with_user_id(expected_scopes, grant_type)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need help understanding how this isn't just covering up a problem. Shouldn't data which comes from the response contain all the necessary scopes? Something feels fishy about this fix, so maybe you can help explain it a bit better. Thanks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are just the expected scopes.
The scopes from the response are actually stored in the body of the decoded access token itself.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a bug:

  • The scopes in the response used to match the scopes in the JWT, but now they don't.
  • I think the fix would be that this comment should be removed, and that we now need to set the scopes in the response to the scopes in the JWT.

Let me know your thoughts.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there may be a misunderstanding here:

  • In the tests, Actual scopes are derived by decoding the token here.
  • What we are editing in the tests are the expected scopes, since we added the logic to add the user_id scope to grant_type password requests.
  • In the tests, we are not manipulating the scopes returned in the response.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you call the API from the command line using a password grant type, is the user_id in the scopes in the dict of the response? This dict should be in sync with the scopes in the JWT, and I’m guessing it’s not, but I could be wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand your point now, the user_id scope is not added in the response to the API. Updated.

Copy link
Contributor

@robrap robrap Oct 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UPDATED: I'm glad we are on the same page about the API response scopes matching the JWT scopes, but I'm still confused about all the test changes. Shouldn't the test just show that these match, and if they don't match, why don't they?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing out, reverted related tests changes.

self.assert_valid_jwt_access_token(
data['access_token'],
self.user,
data['scope'].split(' '),
expected_scopes,
grant_type=grant_type,
should_be_restricted=False,
expires_in=expected_default_expires_in,
should_be_asymmetric_key=asymmetric_jwt
)

def _update_expected_scopes_with_user_id(self, expected_scopes, grant_type):
if grant_type == 'password' and 'user_id' not in expected_scopes:
expected_scopes.append('user_id')

@ddt.data('dot_app')
def test_access_token_fields(self, client_attr):
client = getattr(self, client_attr)
Expand Down Expand Up @@ -258,18 +264,20 @@ def test_restricted_jwt_access_token(self):
response = self._post_request(self.user, self.restricted_dot_app, token_type='jwt')
assert response.status_code == 200
data = json.loads(response.content.decode('utf-8'))

grant_type = 'password'
expected_scopes = data['scope'].split(' ')
self._update_expected_scopes_with_user_id(expected_scopes, grant_type)
assert 'expires_in' in data
assert data['expires_in'] > 0
assert data['token_type'] == 'JWT'
self.assert_valid_jwt_access_token(
data['access_token'],
self.user,
data['scope'].split(' '),
expected_scopes,
should_be_expired=False,
should_be_asymmetric_key=True,
should_be_restricted=True,
grant_type='password'
grant_type=grant_type
)

def test_restricted_access_token(self):
Expand Down Expand Up @@ -322,6 +330,7 @@ def test_jwt_access_token_scopes_and_filters(self, grant_type):
response = self._post_request(self.user, dot_app, token_type='jwt', scope=scopes)
assert response.status_code == 200
data = json.loads(response.content.decode('utf-8'))
self._update_expected_scopes_with_user_id(scopes, grant_type)
self.assert_valid_jwt_access_token(
data['access_token'],
self.user,
Expand Down Expand Up @@ -427,14 +436,20 @@ def _test_jwt_access_token(self, client_attr, token_type=None, headers=None, gra
assert data['expires_in'] > 0
assert data['token_type'] == 'JWT'

expected_scopes = data['scope'].split(' ')
self._update_expected_scopes_with_user_id(expected_scopes, grant_type)
self.assert_valid_jwt_access_token(
data['access_token'],
self.user,
data['scope'].split(' '),
expected_scopes,
grant_type=grant_type,
should_be_asymmetric_key=asymmetric_jwt
)

def _update_expected_scopes_with_user_id(self, expected_scopes, grant_type):
if grant_type == 'password' and 'user_id' not in expected_scopes:
expected_scopes.append('user_id')

@ddt.data('dot_app')
def test_access_token_exchange_calls_dispatched_view(self, client_attr):
client = getattr(self, client_attr)
Expand All @@ -451,11 +466,14 @@ def test_jwt_access_token_exchange_calls_dispatched_view(self, client_attr):
response = self._post_request(self.user, client, token_type='jwt')
assert response.status_code == 200
data = json.loads(response.content.decode('utf-8'))
grant_type = 'password'
expected_scopes = data['scope'].split(' ')
self._update_expected_scopes_with_user_id(expected_scopes, grant_type)
self.assert_valid_jwt_access_token(
data['access_token'],
self.user,
data['scope'].split(' '),
grant_type='password'
expected_scopes,
grant_type=grant_type
)

assert 'expires_in' in data
Expand All @@ -470,11 +488,14 @@ def test_asymmetric_jwt_access_token_exchange_calls_dispatched_view(self, client
response = self._post_request(self.user, client, token_type='jwt', asymmetric_jwt=True)
assert response.status_code == 200
data = json.loads(response.content.decode('utf-8'))
grant_type = 'password'
expected_scopes = data['scope'].split(' ')
self._update_expected_scopes_with_user_id(expected_scopes, grant_type)
self.assert_valid_jwt_access_token(
data['access_token'],
self.user,
data['scope'].split(' '),
grant_type='password',
expected_scopes,
grant_type=grant_type,
should_be_asymmetric_key=True
)

Expand Down
Loading