
Property Boundaries Service

This service manages a database and an API that serve the land registry's INSPIRE polygon and company ownership data. It also has a pipeline that, when run, updates the database with the latest land registry data and backs up the old data.

Requirements

  • Node.js
  • MySQL
  • GDAL tools (including the ogr2ogr command-line tool)
  • PM2

Installing and deploying

  • Run the install-remote.sh script from your local machine to install the application on the server, e.g.:

    bash install-remote.sh -u aubergine [email protected]
    

    Note that this will only succeed once you have uploaded the GitHub SSH deploy key (explained in the script's output).

  • Log into the server and, in the codebase, copy .env.example to .env.

  • Fill in .env with the credentials and API keys (in BitWarden or the password-store).

  • Run bash scripts/deploy.sh to run the DB migration scripts, then build and serve the app with PM2.

Useful dev commands

  • npx sequelize-cli db:migrate:undo:all && npx sequelize-cli db:migrate to reset database migrations
  • npm run dev:serve to start the server responding to API requests.
  • npm run build && npm run plot to plot some of the analysis after a pipeline has run

Ownerships + INSPIRE updates pipeline

All of the pipeline-related code is in the src/pipeline directory. The pipeline can be triggered by an API request like this:

https://<property boundaries service api url>/run-pipeline?secret=<secret>

or the pipeline can be started with additional options (see PipelineOptions for details), e.g.:

https://<property boundaries service api url>/run-pipeline?secret=<secret>&startAtTask=analyseInspire&updateBoundaries=true
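
The pipeline can also be triggered from a script. The snippet below is a minimal sketch (Node 18+, run as an ES module so top-level await works); the PBS_API_URL and PIPELINE_SECRET environment variable names are illustrative, not part of the codebase.

    // Build the /run-pipeline URL from the query params documented above
    const baseUrl = process.env.PBS_API_URL; // e.g. https://<property boundaries service api url>
    const secret = process.env.PIPELINE_SECRET;

    const params = new URLSearchParams({
      secret: secret ?? "",
      startAtTask: "analyseInspire",
      updateBoundaries: "true",
    });

    // Node 18+ provides a global fetch
    const response = await fetch(`${baseUrl}/run-pipeline?${params}`);
    console.log(response.status, await response.text());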

Analysing the output

A pipeline always updates the Land Ownership data (this is the fast, easy bit of the pipeline). Once the updateOwnerships task is complete, the new data should be visible in LX for all users.

After the downloadOwnerships task, an LX super user can see the pending INSPIRE polygons that have been downloaded in a separate, secret data layer.

If the updateBoundaries param is true, the analyseInspire task will write the pending polygons that are accepted as a successful match into the DB, so that they're visible to all users. Further detailed output from the pipeline can be found in the analysis folder in the project's root folder. This output can help you investigate things manually, e.g. to dig into a failed match:

  • find the details in failed-matches.json and copy a lat-lng of a vertex (see the sketch after this list for pulling one out programmatically)
  • log in to LX as a super user (to become a super user, update your record in MySQL on the server)
  • enable the Pending Polygons data layer
  • search for the lat-lng in the LX search bar, then click on nearby properties to visualise the polygon(s) involved
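
The sketch below shows one way to pull a vertex out of failed-matches.json to paste into the LX search bar. The file's exact structure is an assumption here (an array of entries, each with an inspireId and a list of [lng, lat] vertices), so adjust it to match the real output.

    // Print a searchable lat-lng for the first few failed matches (assumed file shape)
    import { readFileSync } from "node:fs";

    type FailedMatch = { inspireId: number; vertices: [number, number][] }; // assumed shape

    const failedMatches: FailedMatch[] = JSON.parse(
      readFileSync("analysis/failed-matches.json", "utf8")
    );

    for (const { inspireId, vertices } of failedMatches.slice(0, 5)) {
      const [lng, lat] = vertices[0];
      console.log(`INSPIRE ${inspireId}: search LX for ${lat}, ${lng}`);
    }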

TODO (roughly in priority order)

  • Step back and think about what we want the pipeline to achieve, and what info we want to try to save as boundaries gradually change. Prioritise and try to narrow the scope. Also think about whether we need any other data sources to achieve this.

  • Add analytics and profiling to find the bottlenecks in the analysis script so they can be optimised. The script currently takes far too long, around 30 mins per council. Tasks could perhaps be parallelised using worker threads (see https://nodejs.org/api/worker_threads.html and the sketch below). Also decide which parts of the algorithm are most needed and remove computation that isn't necessary. Finally, allow pipelines to resume automatically if something goes wrong, e.g. the server reboots; this is fairly likely since the pipeline will take a long time even if we optimise it really well (it's processing a huge amount of data!)
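
    As a rough sketch, the per-council analysis could be fanned out to worker threads roughly like this; the worker script path and the councilIds/workerData shape are illustrative, not the current pipeline API.

    // Run the analysis for several councils in parallel worker threads
    import { Worker } from "node:worker_threads";
    import os from "node:os";

    function analyseCouncil(councilId: string): Promise<void> {
      return new Promise((resolve, reject) => {
        const worker = new Worker("./dist/pipeline/analyse-worker.js", {
          workerData: { councilId }, // illustrative worker script and payload
        });
        worker.on("error", reject);
        worker.on("exit", (code) =>
          code === 0 ? resolve() : reject(new Error(`worker exited with code ${code}`))
        );
      });
    }

    export async function analyseAllCouncils(councilIds: string[]): Promise<void> {
      const concurrency = Math.max(1, os.cpus().length - 1);
      // Simple batching; a real work queue would keep every core busy continuously
      for (let i = 0; i < councilIds.length; i += concurrency) {
        await Promise.all(councilIds.slice(i, i + concurrency).map(analyseCouncil));
      }
    }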

  • Fully spec the behaviour of the pipeline, in particular the matching algorithm for INSPIRE polygons, then add unit tests to match this spec

    • Mocha is set up for this. I made a methods.test.ts file using GitHub Copilot, inspired by the Land Explorer backend.
    • Add these tests to a GitHub CI pipeline, as on the LX backend.
    • We'll need to modularise some of the long functions in the pipeline a bit more (e.g. the comparePolygons function) to make unit testing easier; see the spec sketch after this list.
    • Maybe we need to plot some different polygon scenarios that can be visualised and used for different edge cases.
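
    A spec for the matching behaviour could start out something like the sketch below; the comparePolygons signature, its return shape, the chai assertions and the import path are all assumptions.

    // Mocha spec sketch for comparePolygons (assumed signature and return shape)
    import { expect } from "chai";
    import { comparePolygons } from "../src/pipeline/methods";

    describe("comparePolygons", () => {
      // A simple closed square used as a fixture polygon
      const square: [number, number][] = [
        [0, 0], [0, 1], [1, 1], [1, 0], [0, 0],
      ];

      it("accepts an identical polygon as an exact match", () => {
        expect(comparePolygons(square, square).match).to.equal(true);
      });

      it("rejects a polygon offset well beyond the tolerance", () => {
        const shifted = square.map(([lng, lat]) => [lng + 1, lat + 1] as [number, number]);
        expect(comparePolygons(square, shifted).match).to.equal(false);
      });
    });
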
  • Address the various 'TODO' comments around the codebase

  • Add some docs to /docs to give a high-level overview of what the pipeline is doing. But wherever possible, especially for low-level details, prefer Mocha specs over written documentation. Docs can be ignored but specs with unit tests can't.

  • Improve how the results of the analysis can be understood/visualised. It's currently a lot of data, and it's hard to know which matches to check individually.

  • Enable strict TypeScript checking in tsconfig.json and fix up the existing check failures. Use more modern Sequelize model definitions so that we get types (https://sequelize.org/docs/v7/models/defining-models/); a sketch of this style follows.
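
    A sketch of the typed-model style the linked docs describe, written against Sequelize v6's InferAttributes helpers; the model and column names here are purely illustrative.

    // Typed Sequelize model definition (illustrative model/columns)
    import {
      CreationOptional, DataTypes, InferAttributes, InferCreationAttributes,
      Model, Sequelize,
    } from "sequelize";

    class LandOwnership extends Model<
      InferAttributes<LandOwnership>,
      InferCreationAttributes<LandOwnership>
    > {
      declare id: CreationOptional<number>;
      declare titleNo: string;
      declare proprietorName: string | null;
    }

    export function initLandOwnership(sequelize: Sequelize) {
      LandOwnership.init(
        {
          id: { type: DataTypes.INTEGER, autoIncrement: true, primaryKey: true },
          titleNo: { type: DataTypes.STRING, allowNull: false },
          proprietorName: { type: DataTypes.STRING, allowNull: true },
        },
        { sequelize, tableName: "land_ownerships" }
      );
      return LandOwnership;
    }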
