Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Assigned Location Column to Application Status Report #34744

Merged
merged 20 commits into from
Jul 3, 2024

Conversation

zandre-eng
Copy link
Contributor

@zandre-eng zandre-eng commented Jun 10, 2024

Product Description

The Application Status Report includes a new "Assigned Location(s)" column which will show (up to a max of 4) assigned locations for each user:
Screenshot from 2024-06-12 11-03-57

If a user has more than 4 assigned locations, a "...See More" link becomes available in the column which allows the user to display all the assigned locations for a user:
show-all

Technical Summary

Link to ticket here.

New functions have been added to the ApplicationStatusReport class for formatting assigned locations for a user in an HTML format. This has been done so that the primary location of the user can be shown in bold.

A new JS file has been added which is simply there to handle toggling the "...See More" button on the front-end.

Feature Flag

None

Safety Assurance

Safety story

  • Tested locally
  • Have tested on staging
  • QA done

Automated test coverage

None

QA Plan

QA ticket has been done. Available here.

Rollback instructions

  • This PR can be reverted after deploy with no further considerations

Labels & Review

  • Risk label is set correctly
  • The set of people pinged as reviewers is appropriate for the level of risk of the change

@zandre-eng zandre-eng added awaiting QA QA in progress. Do not merge product/all-users-all-environments Change impacts all users on all environments labels Jun 10, 2024
@zandre-eng zandre-eng marked this pull request as ready for review June 10, 2024 09:29
@mkangia
Copy link
Contributor

mkangia commented Jun 11, 2024

Hey @zandre-eng

Just a quick UI check,

I see we have used the latest implementation of showing more records on HQ, probably from data forwarding records.
Just sharing this old implementation of the same that might more sense here since location lists is not expected to have that much content, for it to show in a new row under the current row.
I am unable to locate this in code but I do see how it used to look before in data forwarding records in this PR's description. Here when one clicks on the 3 dots, the full content would just populate in the same place itself. I believe that would make more sense for locations here.

@zandre-eng
Copy link
Contributor Author

I see we have used the latest implementation of showing more records on HQ, probably from data forwarding records.
Just sharing this old implementation of the same that might more sense here since location lists is not expected to have that much content, for it to show in a new row under the current row.
I am unable to locate this in code but I do see how it used to look before in data forwarding records in #30952 PR's description. Here when one clicks on the 3 dots, the full content would just populate in the same place itself. I believe that would make more sense for locations here.

Hey @mkangia, thanks for sharing. I've tested this out locally and this is how it would look like:

show-all

The GIF is a bit small, but you can try clicking on it to expand it and see the change better.

Do you think it would be worth incorporating both methods? I'm just trying to ensure that we do consider edge cases as rare as they might be. Let's say a user has a ridiculous amount of assigned cases (e.g. 20). In this instance we can show the button, but for anything less we simply show '...' if it's more than 4 locations. What are your thoughts?

@mkangia
Copy link
Contributor

mkangia commented Jun 11, 2024

Beautiful, I like this.

I think we should just add this. And this is also what I had in mind when we were discussing this in grooming.
feel free to consult with Mary as well.

Copy link
Contributor

@ajeety4 ajeety4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good. +1 to the proposed UI change discussed between you and Manish.

@zandre-eng zandre-eng force-pushed the ze/add-loc-col-app-status-report branch from 44f35e9 to 30a284a Compare June 12, 2024 08:49
@zandre-eng zandre-eng marked this pull request as draft June 12, 2024 08:49
@zandre-eng zandre-eng marked this pull request as ready for review June 12, 2024 09:01
@zandre-eng
Copy link
Contributor Author

Given that the above discussion to use a link instead of a button would require restructuring most of the implementation, I have decided to do a rebase to clean up the PR a bit and then push commits for new changes.

FYI @mkangia @ajeety4

@zandre-eng zandre-eng requested a review from ajeety4 June 12, 2024 09:06
@zandre-eng zandre-eng force-pushed the ze/add-loc-col-app-status-report branch 2 times, most recently from 9ebed54 to c229c7a Compare June 12, 2024 09:29
@zandre-eng
Copy link
Contributor Author

@mkangia @ajeety4 Please let me know if the PR looks good from your end. If so, I can get started on pushing out the change comms for this implementation.

Copy link
Contributor

@ajeety4 ajeety4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice refactor 🥇

Copy link
Contributor

@mkangia mkangia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the follow up and redirection @zandre-eng

I think we need to consider a couple of things to make this performant.

if not assigned_loc_ids:
return '---'
primary_loc_id = user.get('location_id')
return self._get_formatted_assigned_location_names(primary_loc_id, assigned_loc_ids)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you could make it simpler by passing the user to the function itself.
Fetching assigned_loc_ids and primary_loc_id isn't really needed in this function

def get_location_column(self, user):
    if not user.get('assigned_location_ids'):
        return '---'
    return self._get_formatted_assigned_location_names(user)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

locs = SQLLocation.objects.filter(location_id__in=assigned_location_ids)
formatted_loc_names = []
for loc in locs:
if loc.location_id == primary_location_id and len(assigned_location_ids) > 1:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we simply bold the location if its primary irrespective of the length of assigned locations?
It would be confusing if help_text says that primary location is in bold and then for one row its not in bold because its the only location.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. I thought it might be unnecessary to highlight the location if there is only one, but you make a good point that we should rather keep this consistent given what the help text says. I'll remove the length check from the conditional.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

corehq/apps/reports/standard/deployments.py Outdated Show resolved Hide resolved
Create an HTML formatted string of the given assigned location names.
The primary location will be highlighted in bold.
"""
locs = SQLLocation.objects.filter(location_id__in=assigned_location_ids)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need some sort of caching here. If reports loads say 100 users at once, we would be firing up 100 queries which could lead to bad performance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could add a quickcache here, however we will still need to do the initial 100 queries.

I'm thinking just before looping through users here, we could do an initial loop through the users to retrieve all their assigned location IDs and map this to a dict. This would look something like this:

@memoized
def user_locations_dict(self, users):
    all_loc_ids = []
    for user in users:
        all_loc_ids += user.get('assigned_location_ids')
    all_locs = SQLLocation.objects.filter(location_id__in=all_loc_ids).values_list('location_id', 'name')
    return {
        loc_id: loc_name
        for loc_id, loc_name in all_locs
    }

The above causes us to do a few extra loops, however we only need to do a single DB call to get all the location names. This dict can then be passed to the func responsible for parsing the HTML. @mkangia do you think this is a good approach or should using a quickcache be sufficient enough?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems good to me. One db query sounds good.

You could use a set instead of list to avoid repeated ids.

Also I learnt this recently from Ethan

return dict(SQLLocation.objects.filter(
    location_id__in=all_loc_ids
).values_list('location_id', 'name'))

will give you the dict you need.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the code, this is very useful. I've implemented the necessary changes so that we only do a single DB hit to get all user locations for a page (7ebeb72). Unfortunately I couldn't add memoized or quickcache due to the fact that we are passing a list of users to the func (which is unhashable).

@@ -317,7 +317,7 @@ def user_locations_dict(self, users):
for loc_id in user.get('assigned_location_ids'):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is the populated on the user though? Can you confirm if this is actually available only for mobile users? Would be good to test this locally for a web user as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an available field on the CouchUser model (code ref), so it's available for both mobile and web users. But looking through the code, it seems this field only gets used for CommCareUser (blank for web users). For web users, their locations are being added here under a domain membership instance.

Good catch, I'll need to make a code change to accommodate this. To make things easier, we could have a helper function to correctly wrap the user doc. In code, this might look something like:

def correctly_wrap_user(self, user_doc):
    from corehq.apps.users.models import CouchUser, WebUser, CommCareUser
    couch_user = CouchUser.wrap(user_doc)
    if couch_user.is_web_user():
        return WebUser.wrap(user_doc)
    return CommCareUser.wrap(user_doc)

Once we have the wrapped user, we can simply call get_location_ids() for assigned locations or get_location_id() for the primary location on either mobile or web users where we need the location IDs.

@mkangia does the above implementation sound fine to you?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI there's also a CouchUser.wrap_correctly() that does what you want here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that's perfect, thanks Ethan.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see we are getting users from elasticsearch here.
I am still not sure what value assigned_location_ids would have for a web user. I think if you can check an ES doc for a web user and for a mobile user, it would be clear how we can fetch location ids just for the relevant domain.
Limiting to domain in sql query seems like a good step to add as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should be able to get it from the unwrapped docs too, though if web users are in the mix, you'll have to pull it from the correct domain membership for them

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @esoergel for your inputs here.

It does seem better to use wrap_correctly then. I thought about this a bit more and I prefer using existing code to do this correctly rather than fiddling around with elasticsearch doc unless that is already done somewhere (which I could not find in HQ anywhere).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside: The way we do location assignment as a field on the couch user is a frequent source of frustration (especially since we store it in different places between web and mobile users). One of these days I'd like to move it to SQL, something like:

class HqUser(models.Model):
    django_user = models.OneToOneField(User)
    locations = models.ManyToManyField(
        SQLLocation, through=`LocationAssignment`, related_name='users')

class LocationAssignment(models.Model):
    domain = models.CharField()
    hq_user = models.ForeignKey(HqUser)
    location = models.ForeignKey(SQLLocation)
    is_primary = models.BooleanField()

Then we could finally work with users and locations in bulk using joins

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @esoergel

That would be the SQL way of doing it.

I do think however if,

  1. primary location id could be saved on user itself? is that information duplication? possibly.
  2. in order to look up locations of a certain set of users, we are going to join three tables. I wonder if in terms of SQL that would hinder the performance considering that is quite often.

I don't know if keeping a list column that has list of location ids on the HQUser would be a bad idea.
Though that does making fetching locations for users in bulk impossible.

So,
For a single user we would still fetch by ID though for bulk we would do a join. Seems good to me.

That is going to be some migration.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is going to be some migration.

😬 ain't that the truth

Locations to users are many-to-many, and I think that's the standard way to do it in Django (the through model is optional, but I'm pretty sure Django makes one anyways if you don't specify it). Primary location is many-to-one, so that could just be a simple foreign key if we need it. That's a good point about the location_id - unfortunately SQLLocation has both an integer primary key (aka id) and a GUID location_id, and the pk is what's stored on related models by default. Both are indexed though, so I bet there's some way to do the join on the location ID. Then you wouldn't need to hit the location table if all you need is the location_id.

Copy link
Contributor

@mkangia mkangia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated approval to confirm the accessible locations check
#34744 (comment)

</div>
''').format(formatted_str, eplise_str, all_str)
''').format(formatted_str, all_str, view_all_str, view_less_str)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this code will be easier to follow if we add additional html elements later. And we are just adding empty nodes which can be avoided.
something like:

html_nodes = ['<span class="locations-list">{formatted_str}</span>']
if len(formatted_loc_names) > 4:
    html_nodes.append[
        '<span class="all-locations-list" style="display:none">{all_str}</span>',
        '...'
    ]
return format_html(f'<div>{''.join(html_nodes)}</div>')

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like this approach, a bit cleaner. I've done some refactoring in (ee88a55).

$(function () {
$('#report-content').on('click', '.toggle-all-locations', function (e) {
$(this).prevAll('.locations-list').toggle();
$(this).children('span').toggle();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: would be nice to limit this by adding a specific class to this span to avoid conflicting with any other span getting added to the div in future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@mkangia mkangia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the patience on review on this one @zandre-eng

Considering Ethan's inputs it does seem that its better to use wrap_correctly and let HQ tell us assigned locations safely instead of figuring things out from ES.
I do think there are performance improvements that you could consider with calling wrap_correctly 2 times for each user. I am not sure how difficult it would be to do it only once within how the code is designed.

@@ -307,6 +312,16 @@ def include_location_data(self):
)
)

def user_locations_dict(self, user_docs):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: a docstring would be useful here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'd suggest naming this to locations_names_dict instead considering what its returning and its not really bound to users at all other than just getting the location IDs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. Addressed in ba975bf

@@ -316,6 +331,7 @@ def process_rows(self, users, fmt_for_export=False):
grouped_ancestor_locs = self.get_bulk_ancestors(location_ids)
self.required_loc_columns = self.get_location_columns(grouped_ancestor_locs)

user_loc_dict = self.user_locations_dict(users)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: and then changing this variables name as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all_str = ', '.join(formatted_loc_names)
view_controls_html_nodes = [
f'<span class="loc-view-control">{_("...See more")}</span>',
f'<span class="loc-view-control" style="display:none">{_("...See less")}</span>',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I am not sure I have seen "See less" on such widgets before. Is "Collapse" a better word?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. I have updated the text in b045a3d

@zandre-eng
Copy link
Contributor Author

Considering Ethan's inputs it does seem that its better to use wrap_correctly and let HQ tell us assigned locations safely instead of figuring things out from ES.
I do think there are performance improvements that you could consider with calling wrap_correctly 2 times for each user. I am not sure how difficult it would be to do it only once within how the code is designed.

Looking through the code, I only really see two ways in which we can avoid having to call wrap_correctly() twice:

  1. Add in a loop to wrap all the user docs and store that in a list for later use.
  2. When wrapping users in the locations dict function, we could save this to a list and return all the wrapped users along with the locations dict.

I'm inclined to go for option 2 as it means we can save on doing another loop, however it will stretch the purpose of the function a bit. @mkangia What are your thoughts?

@zandre-eng zandre-eng added QA Passed and removed awaiting QA QA in progress. Do not merge labels Jul 1, 2024
@mkangia
Copy link
Contributor

mkangia commented Jul 1, 2024

I'm inclined to go for option 2 as it means we can save on doing another loop, however it will stretch the purpose of the function a bit. @mkangia What are your thoughts?

Feel free to go with any approach that you feel is better. Its for 10 users at once by default or max 100. So if making these changes makes the code less good, it would be okay to leave it as it is as well.

@zandre-eng
Copy link
Contributor Author

Feel free to go with any approach that you feel is better. Its for 10 users at once by default or max 100. So if making these changes makes the code less good, it would be okay to leave it as it is as well.

That is a good point about the page size. Given the small set of data we'll be working with, I'll keep the implementation as is. Going for either option (1) or (2) will involve keeping a dict and passing it to various functions which feels like unnecessary additional complexity. Testing this out locally, the wrap takes about 0.0005 seconds to process, so having two wraps shouldn't add any noticable delays to the report.

@zandre-eng zandre-eng requested review from ajeety4 and mkangia July 1, 2024 10:58
Copy link
Contributor

@mkangia mkangia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the follow up @zandre-eng

@@ -307,6 +312,23 @@ def include_location_data(self):
)
)

def locations_names_dict(self, user_docs):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: just hit me that we have user docs in two places, couch and ES. Would be nice if this was clear which one this is either by the name of the parameter or by the documentation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this could be a private function? and could ideally be after process_rows to keep readable order of functions (reference: Newspaper metaphor from Clean Code)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I can see how this can make reading the code a little more difficult. I've done a tiny refactor in d6a20e8

result[0][1] = table
return result

def get_location_column(self, user_doc, user_loc_dict):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this could be a private function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in d6a20e8

@zandre-eng zandre-eng requested a review from mkangia July 2, 2024 07:44
@@ -507,8 +507,8 @@ def _fmt_timestamp(timestamp):
result[0][1] = table
return result

def get_location_column(self, user_doc, user_loc_dict):
user = CouchUser.wrap_correctly(user_doc)
def _get_location_column(self, user_doc_es, user_loc_dict):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: user_es_doc?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in e5c2342. I also renamed user_doc to user_es_doc in the loop to make it more explicit.

@zandre-eng zandre-eng merged commit e743ec4 into master Jul 3, 2024
13 checks passed
@zandre-eng zandre-eng deleted the ze/add-loc-col-app-status-report branch July 3, 2024 07:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
product/all-users-all-environments Change impacts all users on all environments QA Passed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants