Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add scope user_id to JWT payload #33455

Merged
merged 14 commits into from
Oct 30, 2023
1 change: 1 addition & 0 deletions lms/envs/common.py
Original file line number Diff line number Diff line change
Expand Up @@ -1251,6 +1251,7 @@
'certificates:read': _('Retrieve your course certificates'),
'grades:read': _('Retrieve your grades for your enrolled courses'),
'tpa:read': _('Retrieve your third-party authentication username mapping'),
# user_id is added in code as a default scope for JWT cookies and all password grant_type JWTs
'user_id': _('Know your user identifier'),
}),
'DEFAULT_SCOPES': OAUTH2_DEFAULT_SCOPES,
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,36 @@
15. Add scope user_id for JWT token
###################################

Status
------

Accepted

Context
-------

In Feb 2018, to enable analytics (Segment) from Microfrontends (MFEs), a ``user_id`` claim was added to the JWT token `in this PR<https://github.com/openedx/edx-platform/pull/19765>`__.

The LMS API `to create authentication tokens`_ is used by external organizations to request a token on behalf of their users, mostly using grant_type ``client_credentials`` in the request. Since ``user_id`` is considered sensitive information, especially when combined with email and username which were already available in the JWT, it was decided to only add the ``user_id`` claim when a ``user_id`` scope was supplied. All MFE JWT cookies, which are known to only be used directly by the user, automatically used the ``user_id`` scope in order to get the required ``user_id`` claim.

No ADR could be found for the Feb 2018 decision detailed above.

In June 2019, an `ADR was captured in ecommerce`_ around the requirements to have the LMS user_id available for requests to ecommerce.

In 2022, the mobile apps switched to using JWTs for authentication. However, these JWTs were missing the ``user_id`` scope and claim required by the ecommerce service.

.. _to create authentication tokens: https://github.com/openedx/edx-platform/blob/caf8e456e28f9b9a1f5fa7186d3d155112fb75be/openedx/core/djangoapps/oauth_dispatch/urls.py#L14
.. _ADR was captured in ecommerce: https://github.com/openedx/ecommerce/blob/master/docs/decisions/0004-unique-identifier-for-users.rst

Decisions
---------

- The original decision to add the ``user_id`` claim to the JWT token using the ``user_id`` scope has been captured in the context of this ADR, because no ADR could be found.
- The scope ``user_id`` will be added to all requests having grant_type ``password`` in the API `/oauth2/access_token/`.

Consequences
------------

- The claim ``user_id`` will be present in the JWT token for all requesters who already have access to the login credentials of the user account.
- The ``user_id`` scope will continue to protect other JWT requests that don't require this sensitive information.
- This pattern could potentially be used to clean-up the manually added ``user_id`` scope for oauth clients involved in the social auth flow in the future.
25 changes: 19 additions & 6 deletions openedx/core/djangoapps/oauth_dispatch/jwt.py
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@
from edx_toggles.toggles import SettingToggle
from jwt import PyJWK
from jwt.utils import base64url_encode
from oauth2_provider.models import Application

from common.djangoapps.student.models import UserProfile, anonymous_id_for_user

Expand Down Expand Up @@ -79,10 +80,11 @@ def create_jwt_token_dict(token_dict, oauth_adapter, use_asymmetric_key=None):
# .. custom_attribute_name: create_jwt_grant_type
# .. custom_attribute_description: The grant type of the newly created JWT.
set_custom_attribute('create_jwt_grant_type', grant_type)
scopes = token_dict['scope'].split(' ')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels like this should work similarly to jwt_expires_in, where it is processed once here and used twice below, in the JWT and the response dict.
Note: You may want to leave the old scopes = scopes or ['email', 'profile'] as it was in _create_jwt just in case, but otherwise, this will make it more clear that these two values should be matching.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, updated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I’d do this in one statement, but it is just a personal preference, so you can decide.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Altered


jwt_access_token = _create_jwt(
access_token.user,
scopes=token_dict['scope'].split(' '),
scopes=scopes,
expires_in=jwt_expires_in,
use_asymmetric_key=use_asymmetric_key,
is_restricted=oauth_adapter.is_client_restricted(client),
Expand All @@ -91,11 +93,11 @@ def create_jwt_token_dict(token_dict, oauth_adapter, use_asymmetric_key=None):
)

jwt_token_dict = token_dict.copy()
# Note: only "scope" is not overwritten at this point.
Comment on lines 95 to -94
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is true that we are now overwriting all response values, we no longer need to copy the dict. We could just have:

jwt_token_dict = {
    ...
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

refresh_token is still not overwritten in token_dict.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some additional thoughts:

  1. Can you restore the comment, and mention the refresh_token?
  2. I’m wondering if it would be best if the refresh_token had the extra scope as well, and if we didn’t need to add it here, because it would have been added earlier?
  3. If we called the api with the three password grant scopes explicitly (as a test from the command line), does it work or complain about the user_id request? If it complains, does that indicate where the scope update should happen?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Comment restored.
  2. The scope is only added in the JWT, it is not added in the access token or refresh token since it's not required there in this use case.
  3. If we call the API with payload having scopes set to grant types email, profile and user_id, it works in the same fashion and does not error out.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think #2 would be more accurate/consistent, because the refresh token would match the actual scopes.
That said, I agree that this shouldn't be a blocker, so I added a potential code comment just to capture what is going on.

jwt_token_dict.update({
"access_token": jwt_access_token,
"token_type": "JWT",
"expires_in": jwt_expires_in,
"scope": ' '.join(_get_updated_scopes(scopes, grant_type)),
})
return jwt_token_dict

Expand Down Expand Up @@ -167,10 +169,7 @@ def _create_jwt(
else:
increment('create_symmetric_jwt_count')

# Default scopes should only contain non-privileged data.
# Do not be misled by the fact that `email` and `profile` are default scopes. They
# were included for legacy compatibility, even though they contain privileged data.
scopes = scopes or ['email', 'profile']
scopes = _get_updated_scopes(scopes, grant_type)
iat, exp = _compute_time_fields(expires_in)

payload = {
Expand Down Expand Up @@ -285,3 +284,17 @@ def _encode_and_sign(payload, use_asymmetric_key, secret):

jwk = PyJWK(key, algorithm)
return jwt.encode(payload, jwk.key, algorithm=algorithm)


def _get_updated_scopes(scopes, grant_type):
"""
Default scopes should only contain non-privileged data.
Do not be misled by the fact that `email` and `profile` are default scopes.
They were included for legacy compatibility, even though they contain privileged
data. The scope `user_id` must be added for requests with grant_type password.
"""
scopes = scopes or ['email', 'profile']

if grant_type == Application.GRANT_PASSWORD and 'user_id' not in scopes:
scopes.append('user_id')
return scopes
3 changes: 2 additions & 1 deletion openedx/core/djangoapps/oauth_dispatch/tests/test_jwt.py
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,7 @@ def setUp(self):
super().setUp()
self.user = UserFactory()
self.default_scopes = ['email', 'profile']
self.default_scopes_password_grant_type = ['email', 'profile', 'user_id']

def _create_client(self, oauth_adapter, client_restricted, grant_type=None):
"""
Expand Down Expand Up @@ -176,7 +177,7 @@ def test_password_grant_type(self):
jwt_token_dict = jwt_api.create_jwt_token_dict(token_dict, oauth_adapter, use_asymmetric_key=False)

self.assert_valid_jwt_access_token(
jwt_token_dict["access_token"], self.user, self.default_scopes,
jwt_token_dict["access_token"], self.user, self.default_scopes_password_grant_type,
grant_type='password',
)

Expand Down
39 changes: 30 additions & 9 deletions openedx/core/djangoapps/oauth_dispatch/tests/test_views.py
Original file line number Diff line number Diff line change
Expand Up @@ -174,16 +174,22 @@ def _test_jwt_access_token(self, client_attr, token_type=None, headers=None, gra
expected_default_expires_in = 60 * 60
assert data['expires_in'] == expected_default_expires_in
assert data['token_type'] == 'JWT'
expected_scopes = data['scope'].split(' ')
self._update_expected_scopes_with_user_id(expected_scopes, grant_type)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need help understanding how this isn't just covering up a problem. Shouldn't data which comes from the response contain all the necessary scopes? Something feels fishy about this fix, so maybe you can help explain it a bit better. Thanks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are just the expected scopes.
The scopes from the response are actually stored in the body of the decoded access token itself.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a bug:

  • The scopes in the response used to match the scopes in the JWT, but now they don't.
  • I think the fix would be that this comment should be removed, and that we now need to set the scopes in the response to the scopes in the JWT.

Let me know your thoughts.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there may be a misunderstanding here:

  • In the tests, Actual scopes are derived by decoding the token here.
  • What we are editing in the tests are the expected scopes, since we added the logic to add the user_id scope to grant_type password requests.
  • In the tests, we are not manipulating the scopes returned in the response.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you call the API from the command line using a password grant type, is the user_id in the scopes in the dict of the response? This dict should be in sync with the scopes in the JWT, and I’m guessing it’s not, but I could be wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand your point now, the user_id scope is not added in the response to the API. Updated.

Copy link
Contributor

@robrap robrap Oct 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UPDATED: I'm glad we are on the same page about the API response scopes matching the JWT scopes, but I'm still confused about all the test changes. Shouldn't the test just show that these match, and if they don't match, why don't they?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing out, reverted related tests changes.

self.assert_valid_jwt_access_token(
data['access_token'],
self.user,
data['scope'].split(' '),
expected_scopes,
grant_type=grant_type,
should_be_restricted=False,
expires_in=expected_default_expires_in,
should_be_asymmetric_key=asymmetric_jwt
)

def _update_expected_scopes_with_user_id(self, expected_scopes, grant_type):
if grant_type == 'password' and 'user_id' not in expected_scopes:
expected_scopes.append('user_id')

@ddt.data('dot_app')
def test_access_token_fields(self, client_attr):
client = getattr(self, client_attr)
Expand Down Expand Up @@ -258,18 +264,20 @@ def test_restricted_jwt_access_token(self):
response = self._post_request(self.user, self.restricted_dot_app, token_type='jwt')
assert response.status_code == 200
data = json.loads(response.content.decode('utf-8'))

grant_type = 'password'
expected_scopes = data['scope'].split(' ')
self._update_expected_scopes_with_user_id(expected_scopes, grant_type)
assert 'expires_in' in data
assert data['expires_in'] > 0
assert data['token_type'] == 'JWT'
self.assert_valid_jwt_access_token(
data['access_token'],
self.user,
data['scope'].split(' '),
expected_scopes,
should_be_expired=False,
should_be_asymmetric_key=True,
should_be_restricted=True,
grant_type='password'
grant_type=grant_type
)

def test_restricted_access_token(self):
Expand Down Expand Up @@ -322,6 +330,7 @@ def test_jwt_access_token_scopes_and_filters(self, grant_type):
response = self._post_request(self.user, dot_app, token_type='jwt', scope=scopes)
assert response.status_code == 200
data = json.loads(response.content.decode('utf-8'))
self._update_expected_scopes_with_user_id(scopes, grant_type)
self.assert_valid_jwt_access_token(
data['access_token'],
self.user,
Expand Down Expand Up @@ -427,14 +436,20 @@ def _test_jwt_access_token(self, client_attr, token_type=None, headers=None, gra
assert data['expires_in'] > 0
assert data['token_type'] == 'JWT'

expected_scopes = data['scope'].split(' ')
self._update_expected_scopes_with_user_id(expected_scopes, grant_type)
self.assert_valid_jwt_access_token(
data['access_token'],
self.user,
data['scope'].split(' '),
expected_scopes,
grant_type=grant_type,
should_be_asymmetric_key=asymmetric_jwt
)

def _update_expected_scopes_with_user_id(self, expected_scopes, grant_type):
if grant_type == 'password' and 'user_id' not in expected_scopes:
expected_scopes.append('user_id')

@ddt.data('dot_app')
def test_access_token_exchange_calls_dispatched_view(self, client_attr):
client = getattr(self, client_attr)
Expand All @@ -451,11 +466,14 @@ def test_jwt_access_token_exchange_calls_dispatched_view(self, client_attr):
response = self._post_request(self.user, client, token_type='jwt')
assert response.status_code == 200
data = json.loads(response.content.decode('utf-8'))
grant_type = 'password'
expected_scopes = data['scope'].split(' ')
self._update_expected_scopes_with_user_id(expected_scopes, grant_type)
self.assert_valid_jwt_access_token(
data['access_token'],
self.user,
data['scope'].split(' '),
grant_type='password'
expected_scopes,
grant_type=grant_type
)

assert 'expires_in' in data
Expand All @@ -470,11 +488,14 @@ def test_asymmetric_jwt_access_token_exchange_calls_dispatched_view(self, client
response = self._post_request(self.user, client, token_type='jwt', asymmetric_jwt=True)
assert response.status_code == 200
data = json.loads(response.content.decode('utf-8'))
grant_type = 'password'
expected_scopes = data['scope'].split(' ')
self._update_expected_scopes_with_user_id(expected_scopes, grant_type)
self.assert_valid_jwt_access_token(
data['access_token'],
self.user,
data['scope'].split(' '),
grant_type='password',
expected_scopes,
grant_type=grant_type,
should_be_asymmetric_key=True
)

Expand Down
Loading