feat: Add scope user_id to JWT payload #33455
This PR should be closed. Instead, you’ll need to add the user_id scope for the oauth client in Django admin.
Thanks for the review @robrap .
This view does not add further scopes in the token but sets the default scopes provided here: edx-platform/lms/envs/common.py Line 1226 in 7c25c5f
I believe that by "oauth client in Django admin" you mean the ApplicationAccess model in edx-platform, where we can specify scopes for Applications/Clients. In the context of this API, I believe this config only allows us to filter the already added scopes, not to insert new scopes into the payload. This results in all live users using the … Please let me know if I missed something. Thanks again!
@moeez96: Thank you for helping me understand the situation better. When external orgs request a token on behalf of one of their users, we explicitly were avoiding sending our user_id with the PII. That said, I couldn’t find the ADR or discussion around all of this, and it seems the design didn’t account for your current use case. This will require some research to ensure that the user id is only provided to the actual user who has the user id (or who could easily access the user id by other means).
I wish I could be clearer about where we should not be adding user_id, because maybe there would be a simpler solution for this overall. That said, I think your solution sounds great. With password grant type, add … This would be a great time to capture an ADR regarding the past decision (as best as we can) and this updated decision as well. I am happy to help with that. It doesn't need to be perfect, but anything would be better than the current situation, where none of this is documented.
644f291 to ebbe087
Thank you. I made a bunch of proposals for the ADR. Let me know what you think.
openedx/core/djangoapps/oauth_dispatch/docs/decisions/0015-add-scope-user-id-for-jwt.rst
# The scope `user_id` must be added for requests with grant_type password.
scopes = _update_user_id_in_scopes(scopes or ['email', 'profile'], grant_type)
Actually, looking more closely at the code, I think we should follow the same pattern of either using the explicitly defined scopes, or using the default scopes. Something like the following:
# Default scopes should only contain non-privileged data.
# Do not be misled by the fact that `email` and `profile` are default scopes. They
# were included for legacy compatibility, even though they contain privileged data.
if grant_type == Application.GRANT_PASSWORD:
    default_scopes = ['user_id', 'email', 'profile']
else:
    default_scopes = ['email', 'profile']
scopes = scopes or default_scopes
All of this could be packaged into a private method if you wish, but it is more consistent (i.e. won't override explicitly set scopes) and keeps all the scopes code together. Thoughts?
JWT consumes scopes from the access_token payload. By default, scopes will almost never be empty since the default scopes are always added to the access_token (in case of missing explicitly defined scopes).
edx-platform/lms/envs/common.py
Line 1226 in 7c25c5f
OAUTH2_DEFAULT_SCOPES = {
- [Nit] Thanks for the clarification. This is just an opinion, but I'd prefer seeing _update_user_id_in_scopes renamed to _get_updated_scopes (or some name like that), and packaging all the scope-related code (default + user_id) in one place. If you feel strongly otherwise, no need to update.
- I can't think of a clean and simple way to create a new appropriate setting related to user_id, but I'm wondering if we could at least add a comment before
edx-platform/lms/envs/common.py
Line 1242 in 7c25c5f
'user_id': _('Know your user identifier'),
# user_id is added in code as a default scope for JWT cookies and all password grant_type JWTs
expected_scopes = data['scope'].split(' ')
self._update_expected_scopes_with_user_id(expected_scopes, grant_type)
I need help understanding how this isn't just covering up a problem. Shouldn't data, which comes from the response, contain all the necessary scopes? Something feels fishy about this fix, so maybe you can help explain it a bit better. Thanks.
These are just the expected scopes.
The scopes from the response are actually stored in the body of the decoded access token itself.
I think this is a bug:
- The scopes in the response used to match the scopes in the JWT, but now they don't.
- I think the fix would be that this comment should be removed, and that we now need to set the scopes in the response to the scopes in the JWT.
Let me know your thoughts.
I think there may be a misunderstanding here:
- In the tests, the actual scopes are derived by decoding the token here.
- What we are editing in the tests are the expected scopes, since we added the logic to add the user_id scope to grant_type password requests.
- In the tests, we are not manipulating the scopes returned in the response.
If you call the API from the command line using a password grant type, is the user_id in the scopes in the dict of the response? This dict should be in sync with the scopes in the JWT, and I’m guessing it’s not, but I could be wrong.
I understand your point now, the user_id scope is not added in the response to the API. Updated.
UPDATED: I'm glad we are on the same page about the API response scopes matching the JWT scopes, but I'm still confused about all the test changes. Shouldn't the test just show that these match, and if they don't match, why don't they?
Thanks for pointing that out. I reverted the related test changes.
@moeez96: Additionally, could we follow up this PR with a small edit to the Authentication ADR to mention/link to this new ADR from some appropriate place?
@robrap Can you confirm which Authentication ADR you are referring to here?
The newly added ADR in this PR.
I understand we added a new ADR in this PR. Where do you suggest we mention/link this ADR? In other words, which file are you suggesting a change in?
Potentially as a sub-bullet of the JWT Authentication bullet here? https://open-edx-proposals.readthedocs.io/en/latest/best-practices/oep-0042-bp-authentication.html#oauth2-and-jwts. Note that this is not a blocker for this PR, but just an idea to make it easier to learn about this.
Sorry about all the comments, but this code is brittle and needs to be secure, so I just want to ensure it works as expected. There is still something about the test code that doesn't add up for me.
@@ -79,10 +80,11 @@ def create_jwt_token_dict(token_dict, oauth_adapter, use_asymmetric_key=None):
    # .. custom_attribute_name: create_jwt_grant_type
    # .. custom_attribute_description: The grant type of the newly created JWT.
    set_custom_attribute('create_jwt_grant_type', grant_type)
    scopes = token_dict['scope'].split(' ')
It feels like this should work similarly to jwt_expires_in, where it is processed once here and used twice below, in the JWT and the response dict.
Note: You may want to leave the old scopes = scopes or ['email', 'profile'] as it was in _create_jwt just in case, but otherwise, this will make it more clear that these two values should be matching.
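As a rough illustration of "process once, use twice", the sketch below derives the scope list a single time and feeds the same value to both the JWT and the response dict. The function name `create_jwt_token_dict_sketch`, the `build_jwt` callable, and the inline `user_id` rule are all hypothetical simplifications for this example; the real `create_jwt_token_dict` in edx-platform takes different arguments and does more work.

```python
def create_jwt_token_dict_sketch(token_dict, grant_type, build_jwt):
    """Illustrative only: derive scopes once, reuse them for JWT and response."""
    scopes = token_dict["scope"].split(" ")
    # Simplified stand-in for the rule discussed in this PR: password grants
    # also receive the user_id scope.
    if grant_type == "password" and "user_id" not in scopes:
        scopes = scopes + ["user_id"]

    # First use: the scopes are embedded in the JWT itself.
    jwt_access_token = build_jwt(scopes)

    # Second use: the response dict reports exactly the same scopes.
    jwt_token_dict = token_dict.copy()
    jwt_token_dict.update({
        "access_token": jwt_access_token,
        "token_type": "JWT",
        "scope": " ".join(scopes),
    })
    return jwt_token_dict
```

Because both consumers read the same local variable, the response scopes and the JWT scopes cannot drift apart; any key that is not overwritten (for example a refresh token) is carried over from the copied dict unchanged.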
Makes sense, updated.
Nit: I’d do this in one statement, but it is just a personal preference, so you can decide.
Altered
jwt_token_dict = token_dict.copy()
# Note: only "scope" is not overwritten at this point.
If it is true that we are now overwriting all response values, we no longer need to copy the dict. We could just have:
jwt_token_dict = {
...
}
refresh_token is still not overwritten in token_dict.
Some additional thoughts:
- Can you restore the comment, and mention the refresh_token?
- I’m wondering if it would be best if the refresh_token had the extra scope as well, and if we didn’t need to add it here, because it would have been added earlier?
- If we called the api with the three password grant scopes explicitly (as a test from the command line), does it work or complain about the user_id request? If it complains, does that indicate where the scope update should happen?
- Comment restored.
- The scope is only added in the JWT. It is not added in the access token or refresh token, since it's not required there in this use case.
- If we call the API with a payload that has scopes set to email, profile and user_id, it works in the same fashion and does not error out.
I think #2 would be more accurate/consistent, because the refresh token would match the actual scopes.
That said, I agree that this shouldn't be a blocker, so I added a potential code comment just to capture what is going on.
Minor improvements noted. Feel free to merge once they are complete. Thank you!
@@ -91,11 +93,12 @@ def create_jwt_token_dict(token_dict, oauth_adapter, use_asymmetric_key=None):
    )

    jwt_token_dict = token_dict.copy()
    # Note: only "scope" is not overwritten at this point.
    # Note: only "refresh_token" is not overwritten at this point.
Here's a possible code comment trail if we decide to ignore #2 from this comment: https://github.com/openedx/edx-platform/pull/33455/files#r1372861173
# Note: only "refresh_token" is not overwritten at this point.
# At this time, the user_id scope added for grant type password is only added to the
# JWT, and is not added for the DOT access token or refresh token, so we must override
# here. If this inconsistency becomes an issue, then the user_id scope should be
# added earlier with the DOT tokens, and we would no longer need to override "scope".
@@ -325,7 +325,7 @@ def test_jwt_access_token_scopes_and_filters(self, grant_type):
        self.assert_valid_jwt_access_token(
            data['access_token'],
            self.user,
            scopes,
            data['scope'].split(' '),
We might want an additional assertion against the original scopes, to ensure that all 3 match:
- The scopes sent to the request.
- The scopes in the response.
- The scopes in the JWT.
The scopes in the response and the scopes in the JWT will match.
The scopes sent to the request will be a subset of the scopes in the response in this case, since we add user_id in the response scopes on purpose in case of grant_type=password.
Assertion added.
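The relationship described here (response scopes equal JWT scopes, request scopes are a subset of both) could be checked with something like the sketch below. The helper names are hypothetical and are not the assertions added in this PR. The sketch assumes the JWT payload stores its scope list under a `scopes` claim, and it decodes the payload without verifying the signature, which is acceptable only in a test helper.

```python
import base64
import json


def decode_jwt_payload_unverified(token):
    """Decode a JWT's payload segment without signature checks (tests only)."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def assert_scopes_consistent(request_scopes, response_data):
    """Check: request scopes are a subset of response scopes, which equal JWT scopes."""
    response_scopes = response_data["scope"].split(" ")
    # Assumption: the scope list lives under the "scopes" claim of the payload.
    jwt_scopes = decode_jwt_payload_unverified(response_data["access_token"])["scopes"]
    assert set(response_scopes) == set(jwt_scopes)
    assert set(request_scopes) <= set(response_scopes)
```

For a password grant that requested `email` and `profile`, a response reporting `email profile user_id` with a matching JWT passes both checks, while a response that omits `user_id` from `scope` fails the first one.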
2U Release Notice: This PR has been deployed to the edX staging environment in preparation for a release to production.
2U Release Notice: This PR has been deployed to the edX production environment.
Description
The Ecommerce system uses the JWT payload attribute user_id to populate the field lms_user_id in the Ecommerce User table. The JWT payload is built on edx-platform and sent to the Ecommerce system. Currently, the attribute user_id is missing from this payload. This PR adds the missing attribute user_id to the JWT payload.
Supporting information
Jira ticket: https://2u-internal.atlassian.net/browse/LEARNER-9640
Related ADR: https://github.com/openedx/ecommerce/blob/master/docs/decisions/0004-unique-identifier-for-users.rst
Related Ecommerce system attribute mapping: https://github.com/openedx/ecommerce/blob/f0e196fe4371f56a14969a1fb0d2fed79c39630a/ecommerce/settings/base.py#L654
Testing instructions
{{ecommerce_domain}}/api/iap/v1/basket/add/?sku={{sku}} GET
payload has the user_id included in it.
Deadline
None