Improve search and display capabilities for identities in the Flagsmith UI #4016

Open
matthewelwell opened this issue May 24, 2024 · 6 comments · Fixed by #4620
Labels
api Issue related to the REST API front-end Issue related to the React Front End Dashboard improvement Improvement to the existing platform

Comments

@matthewelwell
Contributor

Currently, due to the large quantities of data involved in identity storage, and the way in which that data is stored in our SaaS platform to support the Edge API, searching and displaying additional data about identities can be very difficult.

Some of the main problems are:

  1. It is not possible to search on anything other than the identifier. This is problematic, particularly when introducing non-engineering users to Flagsmith, since the identifier is often a unique key such as a UUID, which most users will not have access to.
  2. Similar to the above, it is not possible to see at a quick glance from the list of identities which identities are which because we only show the identifier.
  3. We do not show the total number of identities (only applicable to SaaS).

Note that this issue combines both #444 and #290.

@matthewelwell
Contributor Author

matthewelwell commented May 24, 2024

The key issue described above is (1). There are a few options that we can investigate here for a solution:

1. Add an alias function to our SDKs which will add a new, indexed, parameter to our identities which can then be searched across.

We would implement something like:

flagsmith.alias(identifier="<uuid>", alias="matthew.elwell")

This could get stored against the identity and displayed alongside the identifier in the list, and the search could search across both the identifier and the alias.
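As a rough illustration of the behaviour described above, here is a minimal in-memory sketch. `Identity`, `IdentityStore` and its methods are hypothetical names invented for this example, not part of any Flagsmith SDK or API, and the prefix matching is an assumption about what a key-based index could reasonably serve.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Identity:
    identifier: str
    alias: Optional[str] = None


class IdentityStore:
    """Hypothetical in-memory stand-in for the identities table."""

    def __init__(self) -> None:
        self._by_identifier: dict[str, Identity] = {}

    def alias(self, identifier: str, alias: str) -> Identity:
        # Create the identity if it has not been seen yet, then attach the alias.
        identity = self._by_identifier.setdefault(identifier, Identity(identifier))
        identity.alias = alias
        return identity

    def search(self, query: str) -> list[Identity]:
        # Case-insensitive prefix match across both the identifier and the alias.
        needle = query.lower()
        return [
            identity
            for identity in self._by_identifier.values()
            if identity.identifier.lower().startswith(needle)
            or (identity.alias is not None and identity.alias.lower().startswith(needle))
        ]


store = IdentityStore()
store.alias(identifier="3f2b8c1e-0000-4000-8000-000000000001", alias="matthew.elwell")
matches = store.search("matthew")  # found via the alias, not the UUID
```

The same `search` call also serves the list view: each result carries both the identifier and the alias, so both can be displayed side by side.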

Pros:

  • Easiest to implement from an API perspective.
  • Would also (at least partly) resolve (2) above
  • Likely wouldn't require any additional infrastructure

Cons:

  • Not very flexible (e.g. doesn't allow users to define multiple values to search across)
  • Would require changes in all SDKs
  • Likely requires the creation of another index on our identities table in DynamoDB, which will increase costs

Note that as a temporary measure here, we could allow users to add an alias via the admin API. Customers could then either do this from the dashboard, so that once an identity has been found via its identifier it can be aliased and found more easily next time, or iterate over their identities via the management API and alias them.

2. Create a search index (where?) based on the traits for each identity

We could create a search index that looks something like:

trait_key_1:trait_value_1;trait_key_2:trait_value_2...

Then, in the search field (or a separate search input), we could let users choose a trait to search by. The search query would then run a full-text search across this field using the form trait_key:trait_value, which avoids matching other trait keys that happen to have similar values.

Note that we may want to have people define the traits that they want to be able to search on, rather than building the search index for all traits for all identities which might get unmanageably large.
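A minimal sketch of that index and query shape, assuming the `key:value;key:value` layout above. The function names and the `searchable_keys` allowlist are hypothetical, and the sketch assumes trait keys and values contain neither `;` nor `:` (a real implementation would need escaping).

```python
from typing import Iterable, Mapping


def build_search_index(traits: Mapping[str, object], searchable_keys: Iterable[str]) -> str:
    # Only index the traits the customer has opted in to, sorted for a stable string.
    allowed = set(searchable_keys)
    return ";".join(f"{key}:{traits[key]}" for key in sorted(traits) if key in allowed)


def build_query(trait_key: str, trait_value: str) -> str:
    # Anchoring the query on the key avoids hits on other keys with similar values.
    return f"{trait_key}:{trait_value}"


def index_matches(index: str, query: str) -> bool:
    # Each entry must start with "key:value-prefix", so "email:matt" cannot match
    # an entry stored under a different key such as "backup_email".
    return any(entry.startswith(query) for entry in index.split(";"))


traits = {
    "email": "matthew@example.com",
    "backup_email": "matthew@example.org",
    "plan": "scale-up",
}
index = build_search_index(traits, searchable_keys=["email", "plan"])
found = index_matches(index, build_query("email", "matthew"))
not_found = index_matches(index, build_query("plan", "matthew"))
```

Restricting the index to `searchable_keys` is one way to keep its size manageable, in line with the note about letting people define which traits are searchable.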

Pros:

  • Most flexible / user friendly solution
  • No changes required to SDKs
  • This could also help us with being able to select trait keys from a list when e.g. creating segments

Cons:

  • Difficult to implement
  • Likely requires some additional infrastructure somewhere (e.g. elasticsearch?)
  • Building this search index for existing identities will be costly

@matthewelwell
Contributor Author

I think for SaaS (more specifically the Edge API), we'd want to look into using DynamoDB streams to trigger a lambda which will update a new model in Django which we can use to search across to get the results, before hitting dynamodb.

This will be a significant undertaking, however: probably a few weeks of work and testing. We would also need to work out how to migrate the existing data into the Postgres models in the first place.

For self hosted, we could probably add this functionality quite easily by just directly searching across the traits as the data for a self hosted install would not be as large as for our SaaS environment.
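To make the stream-to-search-model idea more concrete, here is a hedged sketch of the Lambda side only. It assumes the stream is configured to include new images, and that identity documents carry `identifier` and `identity_traits` attributes with `trait_key` / `trait_value` entries; those attribute names, and the shape of the returned rows, are assumptions for illustration. How the rows reach Postgres is deliberately left out.

```python
from typing import Any


def from_dynamodb_json(attr: dict[str, Any]) -> Any:
    """Convert one DynamoDB-typed attribute (e.g. {"S": "x"}) to a plain value."""
    (type_tag, value), = attr.items()
    if type_tag in ("S", "BOOL"):
        return value
    if type_tag == "N":
        # DynamoDB sends numbers as strings; keep integers as int where possible.
        try:
            return int(value)
        except ValueError:
            return float(value)
    if type_tag == "NULL":
        return None
    if type_tag == "L":
        return [from_dynamodb_json(item) for item in value]
    if type_tag == "M":
        return {key: from_dynamodb_json(item) for key, item in value.items()}
    raise ValueError(f"unsupported DynamoDB type: {type_tag}")


def extract_search_rows(event: dict[str, Any]) -> list[dict[str, str]]:
    """Flatten INSERT/MODIFY stream records into one row per identity trait."""
    rows = []
    for record in event.get("Records", []):
        image = record.get("dynamodb", {}).get("NewImage")
        if record.get("eventName") == "REMOVE" or image is None:
            # Deletions would need to remove rows; this sketch simply skips them.
            continue
        identity = from_dynamodb_json({"M": image})
        for trait in identity.get("identity_traits", []):
            rows.append(
                {
                    "identifier": identity["identifier"],
                    "trait_key": trait["trait_key"],
                    "trait_value": str(trait["trait_value"]),
                }
            )
    return rows


def handler(event: dict[str, Any], context: Any) -> dict[str, int]:
    rows = extract_search_rows(event)
    # Persisting `rows` to the searchable model is out of scope for this sketch.
    return {"rows_prepared": len(rows)}
```

Keeping the parsing in `extract_search_rows` separate from `handler` means the same flattening could be reused for a one-off backfill of existing identities.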

@matthewelwell matthewelwell added improvement Improvement to the existing platform front-end Issue related to the React Front End Dashboard api Issue related to the REST API labels Jun 11, 2024
@matthewelwell matthewelwell moved this to Planned 2024 in Flagsmith Roadmap Jun 20, 2024
@novakzaballa
Contributor

> I think for SaaS (more specifically the Edge API), we'd want to look into using DynamoDB streams to trigger a lambda which will update a new model in Django which we can use to search across to get the results, before hitting dynamodb.

I love this idea for self-hosted. I remember it was also suggested by @dabeeeenster to handle identity overrides in local evaluation. For SaaS, I recommend a cheaper and more efficient solution for large data sets: this type of use case is ideal for a data lake / data warehouse solution. As I have suggested several times, we could:

  • Use AWS S3 as our data lake store because it would be cheap.
  • Store our data as Parquet files, a binary columnar format (usually compressed) that optimizes access to large datasets.
  • Use any Apache Spark compatible Big Data DBMS to consume/query the data like Amazon Athena, Amazon Redshift, or any other.

That will allow the customers to make queries like:

  • Search identities by any trait value
  • List Identities of a segment
  • Export their data so they can analyze it with data warehouse tools like Tableau or time-series tools like InfluxDB

Another advantage is that in the future, we could offer data analysis ourselves if we want.

This would allow us to store any historical and non-operational data here and access it by any criteria. We could also create materialized views for the most-used access patterns, so that customers can access and analyze their information in any way, but that is out of the scope of this particular issue.

@matthewelwell matthewelwell moved this from Planned 2024 to Planned Q3 2024 in Flagsmith Roadmap Jul 31, 2024
@matthewelwell
Contributor Author

I've begun investigating this a little further. I have made a start on some PoC code for option 2 in my comment above (here). See the WIP PR here.

Some important notes:

  1. When using DynamoDB streams with global replication, it is sufficient to connect to a stream in a single region, because all replicated writes will also trigger that stream.
  2. I've done some basic maths on the AWS pricing (although don't quote me on it) and it doesn't look like it will be expensive. See additional calculations / notes here.

Questions to answer:

  1. What service will actually consume the DDB stream? Probably Lambda? But then how do we eventually get the data into postgres? RDS proxy? An endpoint in the core API to queue a task?

@matthewelwell
Contributor Author

@kyle-ssg In this PR #4569 I have added a new field to the edge identities called "dashboard_alias".

From a FE perspective we need to:

  1. Display it on the detail view of an identity
  2. Allow an option to update it via the detail view of an identity
  3. Add functionality to search by dashboard_alias (by simply searching for dashboard_alias:<alias>)
  4. Maybe tidy up my bad implementation of the dashboard alias in the list view?
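The `dashboard_alias:<alias>` search convention in item 3 could be handled with a small parser along these lines. This is only a sketch: `parse_identity_search` is a hypothetical helper, and falling back to an identifier search when the prefix is absent is an assumption rather than confirmed behaviour of the API or the dashboard.

```python
ALIAS_PREFIX = "dashboard_alias:"


def parse_identity_search(raw: str) -> tuple[str, str]:
    """Return (field, term), where field is "dashboard_alias" or "identifier"."""
    query = raw.strip()
    if query.startswith(ALIAS_PREFIX):
        # Everything after the prefix is the alias to search for.
        return ("dashboard_alias", query[len(ALIAS_PREFIX):].strip())
    # Assumed default: anything without the prefix is an identifier search.
    return ("identifier", query)
```

For example, `parse_identity_search("dashboard_alias:matthew.elwell")` yields the alias field and the term `matthew.elwell`, while a bare UUID is treated as an identifier search.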

@kyle-ssg kyle-ssg linked a pull request Sep 18, 2024 that will close this issue
4 tasks
@kyle-ssg kyle-ssg mentioned this issue Sep 18, 2024
4 tasks
@github-project-automation github-project-automation bot moved this from In Progress to Done 2023 in Flagsmith Roadmap Oct 2, 2024
@matthewelwell matthewelwell reopened this Nov 6, 2024
@matthewelwell
Contributor Author

Re-opening to track being able to search by trait.

Projects
Status: Done 2023
3 participants