Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New view for Data Dict JSON to support chunk load #34787

Merged
merged 21 commits into from
Jul 3, 2024

Conversation

ajeety4
Copy link
Contributor

@ajeety4 ajeety4 commented Jun 18, 2024

Product Description

Technical Summary

This is one of the sub tasks that creates a new django view with support to load case properties in a chunked fashion using skip and limit (defaults to 500).
If no case type is provided, the view returns all case types with their respective properties count.
With case type parameter, the view returns properties for that case type based on skip and limit query params.

See Technical Spec for more details.

Once the related changes are completed, the current urls for fetching case types and case properties will be migrated to this new view and old view would be removed.

JIRA Ticket and see linked tickets for related tasks as part of chunk loading.
The related tasks are basically updating the javascript model and template to support chunk loading.

Feature Flag

N/A. Available on all plans with Data Dictionary.

Safety Assurance

Safety story

Creates a new view that is not yet used, so changes are extremely safe.
Local testing done for new view

Automated test coverage

New test cases added

QA Plan

None

Rollback instructions

  • This PR can be reverted after deploy with no further considerations

Labels & Review

  • Risk label is set correctly
  • The set of people pinged as reviewers is appropriate for the level of risk of the change

Copy link

sentry-io bot commented Jun 18, 2024

🔍 Existing Issues For Review

Your pull request is modifying functions with the following pre-existing issues:

📄 File: corehq/apps/data_dictionary/util.py

Function Unhandled Issue
get_used_props_by_case_type ESError: ConnectionTimeout caused by - ReadTimeoutError(HTTPConnectionPool(host='10.202.40.159', port=9200... ...
Event Count: 24
📄 File: corehq/apps/data_dictionary/views.py (Click to Expand)
Function Unhandled Issue
data_dictionary_json ESError: ConnectionTimeout caused by - ReadTimeoutError(HTTPConnectionPool(host='10.202.41.10', port=9200)... ...
Event Count: 9
---

Did you find this useful? React with a 👍 or 👎

@ajeety4 ajeety4 force-pushed the ay/data-dict-chunk-load branch from da777c5 to bac90de Compare June 18, 2024 16:52
@ajeety4 ajeety4 marked this pull request as ready for review June 18, 2024 16:59
@ajeety4 ajeety4 added the product/feature-flag Change will only affect users who have a specific feature flag enabled label Jun 18, 2024
Copy link
Contributor

@zandre-eng zandre-eng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good, just a few comments regarding the view.

corehq/apps/data_dictionary/views.py Show resolved Hide resolved
"properties_count": case_type.properties_count,
}

if case_type_name:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: This function is quite long. Could we move the code related to fetching data for a single case type into its own separate function? This could even be split up into two separate views.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was in favour of this (moving into a different view) however went against it seeing some amount of common code. I was keen to see what others say during review.
I am going to split them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for splitting function

Copy link
Contributor

@kaapstorm kaapstorm Jun 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I would also advocate splitting. I would make a few changes:

  1. Split load_fhir_resource_mappings() into two functions: One for case types and one for case properties.
  2. Pull _get_case_data() out. Move into it some of the code that would become duplicated. Maybe something like:
     def get_case_type_data(domain, case_type):
         case_type_app_module_count = get_case_type_app_module_count(domain)
         used_props_by_case_type = get_used_props_by_case_type(domain)
         module_count = case_type_app_module_count.get(case_type.name, 0)
         used_props = used_props_by_case_type.get(case_type.name, [])
         data = {
             "name": case_type.name,
             "fhir_resource_type": None,
             "is_deprecated": case_type.is_deprecated,
             "module_count": module_count,
             "is_safe_to_delete": len(used_props) == 0,
             "properties_count": case_type.properties_count,
         }
         if toggles.FHIR_INTEGRATION.enabled(domain):
             resource_type_name_by_case_type = get_fhir_resource_type_map(domain)
             data["fhir_resource_type"] = resource_type_name_by_case_type.get(case_type)
         return data
    
  3. Split the view at the else: into two views: One if the case type is given, and one to list the case types.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 384263b

corehq/apps/data_dictionary/views.py Show resolved Hide resolved
queryset = CaseType.objects.filter(domain=domain).annotate(properties_count=Count('property'))
if not request.GET.get("load_deprecated_case_types", False) == "true":
queryset = queryset.filter(is_deprecated=False)
case_types_data = [_get_case_data(case_type) for case_type in queryset]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need the full case type JSON when just retrieving the list of case types and their property counts? I imagine properties like is_safe_to_delete and module_count we only need when loading the full set of data for a case type (since this is when a user will be able to delete the case type).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great point. I went with the expected response in the tech spec but I agree on this. I can check on this and get rid of things that we do not need.

Copy link
Contributor Author

@ajeety4 ajeety4 Jun 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After thinking/looking more on this, I think logically all the case type related data including is_safe_to_delete and module_count should be part of the Case Type API.
Instead, we should consider not returning these case type details when we use the case properties endpoint - just return case type name, _links, properties_count and their grouped properties groups.
Does that makes sense ? @zandre-eng @kaapstorm

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm okay with what you've done in f91eaf5 -- I'm feeling quite pragmatic about this. The viewmodel is the only thing that will be consuming this API, and as long as it has what it needs before it needs it, I think that works.

from corehq import privileges
from corehq.util.test_utils import flag_enabled
# TODO Remove this once we migrate to the new view
urlpatterns.insert(0, url(r"^json_v2/$", data_dictionary_json_v2, name='data_dictionary_json_v2'))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of dynamically inserting these URL patterns for the tests, would it make sense to rather just add these to the URLs file? They already have a different URL path to the current DD JSON endpoint, and we can always switch them out when we remove the old views.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, that can be done. I only was not in favour of adding new urls , however considering we are going to remove them in short time, I am good to add these in the urls.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 384263b

})
case_type_data["groups"].append(group_data)

return JsonResponse(case_type_data)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the geo_case_property not also be returned here? After the case property data has been loaded we use the property to identify which ones are used in the Geospatial feature.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. Although not very sure on that, I think this was probably not required as Norman has completed the changes on the JS side.

Comment on lines 194 to 203
current_url = request.build_absolute_uri()
links = {"self": update_url_query_params(current_url, {"skip": skip, "limit": limit})}
if skip:
links["previous"] = update_url_query_params(
current_url,
{"skip": max(skip - limit, 0), "limit": limit}
)
if case_type_data["properties_count"] > (skip + limit):
links["next"] = update_url_query_params(current_url, {"skip": skip + limit, "limit": limit})
case_type_data["_links"] = links
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also feels like it could be split out into a separate method, maybe something like update_url_params or so.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 1339a9d

Copy link
Contributor

@mkangia mkangia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wanted to suggest a slightly different approach to the PR

  1. refactor the current data_dictionary_json to extract out pieces, specifically things that will stay the same
  2. copy exactly the same view into a v2
  3. modify the bits that need to change

This would make it easier to identify

  1. what has stayed the same or what code is not new
  2. what has changed and is also relevant for review

As in the current state, I believe the PR is really hard to read and review.

P.S: You can ignore this if a lot has changed from the previous view.

corehq/apps/data_dictionary/urls.py Show resolved Hide resolved
@ajeety4
Copy link
Contributor Author

ajeety4 commented Jun 24, 2024

Just wanted to suggest a slightly different approach to the PR

  1. refactor the current data_dictionary_json to extract out pieces, specifically things that will stay the same
  2. copy exactly the same view into a v2
  3. modify the bits that need to change

This would make it easier to identify

  1. what has stayed the same or what code is not new
  2. what has changed and is also relevant for review

As in the current state, I believe the PR is really hard to read and review.

P.S: You can ignore this if a lot has changed from the previous view.

Good suggestion in terms of making the review better.
However as you mentioned, there are some changes to the functionality/refactoring and looking at additional review comments, I am going to keep as it is.

@mkangia
Copy link
Contributor

mkangia commented Jun 24, 2024

However as you mentioned, there are some changes to the functionality/refactoring and looking at additional review comments, I am going to keep as it is.

Hey @ajeety4
I would still request if you could reconsider how this is currently done. Even if not the approach I suggested, I'd recommend going through the commits yourself and see what changes would make this easier to review. That would not only make review easier but also make it possible to clearly understand the approach and reveal changes from how data dictionary worked previously, and if any, highlight concerns. As of now, this all looks like a new feature altogether.

Though if others are happy to review the way it is currently, I am not going to block on this.

@ajeety4
Copy link
Contributor Author

ajeety4 commented Jun 25, 2024

However as you mentioned, there are some changes to the functionality/refactoring and looking at additional review comments, I am going to keep as it is.

Hey @ajeety4 I would still request if you could reconsider how this is currently done. Even if not the approach I suggested, I'd recommend going through the commits yourself and see what changes would make this easier to review. That would not only make review easier but also make it possible to clearly understand the approach and reveal changes from how data dictionary worked previously, and if any, highlight concerns. As of now, this all looks like a new feature altogether.

Though if others are happy to review the way it is currently, I am not going to block on this.

Hi @mkangia ,
I agree with the notion that this does looks like a new feature. Looking at the points highlighted, I feel like it is worth cleaning up.
I intend to create a old view and make changes on top of that so they are clear.
@Charl1996 @kaapstorm @zandre-eng , just wanted to check that it is okay to go ahead with the new commits ?

Copy link
Contributor

@kaapstorm kaapstorm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure whether all the comments have been resolved, but I'm happy with this.

@ajeety4 ajeety4 force-pushed the ay/data-dict-chunk-load branch from f91eaf5 to a97f285 Compare June 27, 2024 15:58
@ajeety4
Copy link
Contributor Author

ajeety4 commented Jun 27, 2024

However as you mentioned, there are some changes to the functionality/refactoring and looking at additional review comments, I am going to keep as it is.

Hey @ajeety4 I would still request if you could reconsider how this is currently done. Even if not the approach I suggested, I'd recommend going through the commits yourself and see what changes would make this easier to review. That would not only make review easier but also make it possible to clearly understand the approach and reveal changes from how data dictionary worked previously, and if any, highlight concerns. As of now, this all looks like a new feature altogether.
Though if others are happy to review the way it is currently, I am not going to block on this.

Hi @mkangia , I agree with the notion that this does looks like a new feature. Looking at the points highlighted, I feel like it is worth cleaning up. I intend to create a old view and make changes on top of that so they are clear. @Charl1996 @kaapstorm @zandre-eng , just wanted to check that it is okay to go ahead with the new commits ?

Hello Team, as conveyed , I have made the commit changes and force pushed it. So the first commit creates a copy of existing view and have made changes on top of that which makes it better to understand.
The end result aka code is same though.
@kaapstorm you might need to do a rebase.

Apologies for any inconvenience.

@mkangia
Copy link
Contributor

mkangia commented Jun 28, 2024

Hey @ajeety4

Just checking that this should be reviewed from the first commit again?

@ajeety4
Copy link
Contributor Author

ajeety4 commented Jul 1, 2024

Hey @ajeety4

Just checking that this should be reviewed from the first commit again?

Hey @mkangia , that is correct !

@kaapstorm
Copy link
Contributor

kaapstorm commented Jul 1, 2024

btw, this branch is currently deployed to Staging @ajeety4 @mkangia and is being tested by QA as part of testing data dictionary changes.

  • Rebased branch: ay+ze+nh/dd_chunked
  • QA ticket: QA-6727

Comment on lines 213 to 217
properties_queryset = CaseProperty.objects.select_related('group').filter(case_type=case_type)
properties_queryset = properties_queryset.order_by('group_id', 'index', 'pk')[skip:skip + limit]
properties_queryset = properties_queryset.prefetch_related(
Prefetch('allowed_values', queryset=CasePropertyAllowedValue.objects.order_by('allowed_value'))
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps very subjective, but I find blocks like this to be more readable when they are inside parentheses instead of reassigned over several lines. e.g.

Suggested change
properties_queryset = CaseProperty.objects.select_related('group').filter(case_type=case_type)
properties_queryset = properties_queryset.order_by('group_id', 'index', 'pk')[skip:skip + limit]
properties_queryset = properties_queryset.prefetch_related(
Prefetch('allowed_values', queryset=CasePropertyAllowedValue.objects.order_by('allowed_value'))
)
properties_queryset = (
CaseProperty.objects
.select_related('group')
.filter(case_type=case_type)
.order_by('group_id', 'index', 'pk')[skip:skip + limit]
.prefetch_related(Prefetch(
'allowed_values',
queryset=CasePropertyAllowedValue.objects.order_by('allowed_value')
))
)

I'm okay with leaving this as it is though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I too feel this is more readable. Thanks Norman. Addressed in 606be7c

@ajeety4 ajeety4 force-pushed the ay/data-dict-chunk-load branch 2 times, most recently from 2cd5ea9 to a97f285 Compare July 3, 2024 07:00
Copy link
Contributor

@zandre-eng zandre-eng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good from my side, great work!

@zandre-eng zandre-eng added the product/invisible Change has no end-user visible impact label Jul 3, 2024
@ajeety4 ajeety4 added product/all-users-all-environments Change impacts all users on all environments and removed product/feature-flag Change will only affect users who have a specific feature flag enabled labels Jul 3, 2024
@ajeety4 ajeety4 merged commit 8232c9d into master Jul 3, 2024
13 checks passed
@ajeety4 ajeety4 deleted the ay/data-dict-chunk-load branch July 3, 2024 12:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
product/all-users-all-environments Change impacts all users on all environments product/invisible Change has no end-user visible impact
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants