Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/3490 apostrophes in titles #2360

Open
wants to merge 32 commits into
base: develop
Choose a base branch
from

Conversation

RK206
Copy link
Contributor

@RK206 RK206 commented Mar 11, 2024

ascii folding for special characters

Please don't delete any sections when completing this PR template; instead enter N/A for checkboxes or sections which are not applicable, unless otherwise stated below

See https://github.com/DOAJ/doajPM/issues/3490

Describe the scope/purpose of the PR here in as much detail as you like

This PR is to do ascii folding to enable search for some special characters.
Updated Article model to Seamless
Updated Seamless couple of places
This has to be thoroughly tested as there are mapping changes and index need to be re-indexed.

Categorisation

This PR...

  • has scripts to run
  • has migrations to run
  • adds new infrastructure
  • changes the CI pipeline
  • affects the public site
  • affects the editorial area
  • affects the publisher area
  • affects the monitoring

Basic PR Checklist

Instructions for developers:

  • For each checklist item, if it is N/A to your PR check the N/A box
  • For each item that you have done and confirmed for yourself, check Developer box (including if you have checked the N/A box)

Instructions for reviewers:

  • For each checklist item that has been confirmed by the Developer, check the Reviewer box if you agree
  • For multiple reviewers, feel free to add your own checkbox with your github username next to it if that helps with review tracking

Code Style

  • No deprecated methods are used

    • N/A
    • Developer
    • Reviewer
  • No magic strings/numbers - all strings are in constants or messages files

    • N/A
    • Developer
    • Reviewer
  • ES queries are wrapped in a Query object rather than inlined in the code

    • N/A
    • Developer
    • Reviewer
  • Where possible our common library functions have been used (e.g. dates manipulated via dates)

    • N/A
    • Developer
    • Reviewer
  • Cleaned up commented out code, etc

    • N/A
    • Developer
    • Reviewer
  • Urls are constructed with url_for not hard-coded

    • N/A
    • Developer
    • Reviewer

Testing

  • Unit tests have been added/modified

    • N/A
    • Developer
    • Reviewer
  • Functional tests have been added/modified

    • N/A
    • Developer
    • Reviewer
  • Code has been run manually in development, and functional tests followed locally

    • N/A
    • Developer
    • Reviewer
  • Have CSS/style changes been implemented? If they are of a global scope (e.g. on base HTML elements) have the downstream impacts of the change in other areas of the system been considered?

    • N/A
    • Developer
    • Reviewer

Documentation

Release Readiness

  • If needed, migration has been created and tested locally
    Migration script to run
    python portality/scripts/es_reindex.py portality/migrate/3490_ascii_folding/migrate.json

    • N/A
    • Developer
    • Reviewer
  • Release sheet has been created, and completed as far as is possible https://docs.google.com/spreadsheets/d/1Bqx23J1MwXzjrmAygbqlU3YHxN1Wf7zkkRv14eTVLZQ/edit#gid=1426896557

    • N/A
    • Developer
    • Reviewer
  • There has been a recent merge up from develop (or other base branch). List the dates of the merges up from develop below

    • 2024-03-11
    • 2024-03-13
    • 2024-08-27

Testing

List the Functional Tests that must be run to confirm this feature

Public search ascii folded search

Deployment

What deployment considerations are there? (delete any sections you don't need)

Configuration changes

What configuration changes are included in this PR, and do we need to set specific values for production

Scripts

What scripts need to be run from the PR (e.g. if this is a report generating feature), and when (once, regularly, etc).

Migrations

What migrations need to be run to deploy this

Monitoring

What additional monitoring is required of the application as a result of this feature

New Infrastructure

What new infrastructure does this PR require (e.g. new services that need to run on the back-end).

Continuous Integration

What CI changes are required for this

@philipkcl philipkcl self-requested a review March 25, 2024 09:55
Copy link
Contributor

@philipkcl philipkcl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added some recommendation and questions, some unittest fails

.github/PULL_REQUEST_TEMPLATE.md Outdated Show resolved Hide resolved
doajtest/unit/test_query.py Outdated Show resolved Hide resolved
portality/lib/dataobj.py Outdated Show resolved Hide resolved
portality/models/v2/shared_structs.py Show resolved Hide resolved
portality/models/v1/shared_structs.py Outdated Show resolved Hide resolved
Copy link
Contributor

@richard-jones richard-jones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From our discussions and review today, we found that the feature itself works well, and gives us the results we are looking for. There are some code structure changes to make to finalise:

  1. Change the code that generates the mappings, such that the mapping data and the seamless object data are logically separated. The mappings for the analysable fields should be included in the same way as the custom mappings for notes. The additional changes to seamless and dataobj code should be rolled back
  2. We can remove the changes to the v1 versions of objects, they are legacy models and shouldn't be updated
  3. We need to run the unit tests over a migrated index and confirm that they are all successful

Once we have done (1) we need to confirm that the resulting mapping still applies the ".exact" mapping which is required for a lot of system functions.

While these changes are being made, this can also go to test for a review of the search behaviour

@RK206
Copy link
Contributor Author

RK206 commented Aug 27, 2024

Updated as discussed

Copy link
Contributor

@richard-jones richard-jones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of things that need to be rolled back to remove the additional_fields stuff completely, then this is good to move to the next stage

portality/lib/dataobj.py Outdated Show resolved Hide resolved
@@ -354,7 +354,7 @@ def get_single(self, path, coerce=None, default=None, allow_coerce_failure=True)
return val

def set_single(self, path, val, coerce=None, allow_coerce_failure=False, allowed_values=None, allowed_range=None,
allow_none=True, ignore_none=False, context=""):
allow_none=True, ignore_none=False, context="", additional_fields = None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RK206 need to roll back this change on seamless

return mapping_opts['coerces'][spec['coerce']]
field_mapping = deepcopy(mapping_opts['coerces'][spec['coerce']])
if 'additional_fields' in spec:
field_mapping = {**field_mapping, **mapping_opts[spec['additional_fields']]}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These new lines no longer needed now, so can be rolled back

@RK206
Copy link
Contributor Author

RK206 commented Aug 28, 2024

Sorry, missed to remove the 'additional_fields'. Now rolled back all those changes.

@Steven-Eardley
Copy link
Contributor

Should this change cover Applications as well?

@RK206
Copy link
Contributor Author

RK206 commented Nov 6, 2024

Applications are not covered in this code changes. Applications can be added if required.

@dommitchell
Copy link
Contributor

@RK206 if it isn't too much trouble then applications should be included.

@RK206
Copy link
Contributor Author

RK206 commented Nov 7, 2024

Included Application as well and committed the changes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants