
How to Create a Pipeline in Theia for Researchers to Run

Chelsea Troy edited this page May 29, 2022 · 1 revision

Creation

  1. If you are running locally, start the app with pipenv run server. If you've spun up the app with docker-compose, the app is already running.
  2. To log in, go to HOME (http://localhost:8080) and log in with Panoptes.
  3. Create a project (the ID must match the ID of the associated project in Panoptes; the name can be anything)
  4. Create a pipeline
    • name can be anything,
    • workflow is optional, BUT the workflow ID is used somehow by the multiple subject sets option
    • multiple subject sets will create a subject set for each scene, each containing about 1,000 tiles
    • IMPORTANT!!! DO NOT SET the subject set ID
    • Set the correct project
  5. Create Pipeline Stages
    • sort order is the order this stage will be run in
    • output format is for overriding (for example, to .png if the output image is something else)
    • operation is a name (see the keys in operations/__init__.py)
    • selected images and config cannot be supplied here (you have to set these through the console or the Admin UI)
    • set the correct pipeline (the one you just made)
    • the final pipeline stage should be an "upload to panoptes" pipeline stage, which adds the subject sets to Panoptes
    • each pipeline stage has config, which can be determined from what keys are referenced by self.config in the operation
      • for example, remapping images takes select_images, which refers to an array of strings, such as "green" or "blue", from the set specified in the adapter
      • Some config variables (those that are lists or JSON blobs) must be set through the console or the Admin UI rather than through the API UI. These include:
        • selected images on the pipeline stage
        • config on the pipeline stage

Example of setting these variables in the console:

$ pipenv run console
>>> p = PipelineStage.objects.get(pk=2)
>>> p
<PipelineStage: PipelineStage object (2)>
>>> p.config = {}
>>> p.save()
>>> p = PipelineStage.objects.get(pk=2)
>>> p.config
{}

Testing

  1. Create an Imagery Request (this is what a client researcher would do):
    • adapter name: usgs for now (valid values are in adapters/__init__.py)
    • dataset name: one of the keys of DATASET_LOOKUP in the adapter.py file for the type of adapter you used
    • max cloud cover: the maximum percentage of the image that can be covered by clouds, for example 30
    • max results: the number of images to return, for example 5
    • wgs row: the WRS row (we need to rename this), for example 23
    • wgs path: the WRS path (we need to rename this), for example 31
  2. At this point, jobs have been queued for the Celery workers to pick up. If you've spun up the app with docker-compose, the workers are already running. If you're running the app locally, you can start a Celery worker with pipenv run worker.

Eventually, new subject sets will show up in the project on Panoptes.
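
As a quick sanity check before submitting, the Imagery Request fields from step 1 of Testing can be sketched as a small Python structure with basic range checks. This is an illustration only, not Theia's model or API: the class name, the snake_case field names, and the validate method are hypothetical, and the row and path limits assume the Landsat WRS-2 grid (paths 1 to 233, rows 1 to 248).

```python
from dataclasses import dataclass


@dataclass
class ImageryRequestParams:
    """Hypothetical container mirroring the Imagery Request form fields."""
    adapter_name: str      # "usgs" for now
    dataset_name: str      # a key of DATASET_LOOKUP in the adapter
    max_cloud_cover: int   # percentage, 0 to 100
    max_results: int       # number of images to return
    wgs_row: int           # actually the WRS row
    wgs_path: int          # actually the WRS path

    def validate(self) -> list[str]:
        """Return a list of human-readable problems; empty means no problems found."""
        problems = []
        if not 0 <= self.max_cloud_cover <= 100:
            problems.append("max cloud cover must be a percentage between 0 and 100")
        if self.max_results < 1:
            problems.append("max results must be at least 1")
        # Assumption: WRS-2 grid limits.
        if not 1 <= self.wgs_row <= 248:
            problems.append("WRS row must be between 1 and 248")
        if not 1 <= self.wgs_path <= 233:
            problems.append("WRS path must be between 1 and 233")
        return problems


# The example values from this page; the dataset name is a placeholder.
params = ImageryRequestParams(
    adapter_name="usgs",
    dataset_name="<a key of DATASET_LOOKUP>",
    max_cloud_cover=30,
    max_results=5,
    wgs_row=23,
    wgs_path=31,
)
print(params.validate())
```

The checks only catch out-of-range numbers. Whether the adapter and dataset names are valid still has to be confirmed against the adapters code.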
