Commit

feat: Remove lockfiles when running Exporter and Flattener tasks, allowing tasks to restart, closes #354
jpmckinney committed May 8, 2024
1 parent ec39087 commit 1e67eda
Showing 3 changed files with 27 additions and 3 deletions.
10 changes: 8 additions & 2 deletions data_registry/process_manager/task/exporter.py
@@ -5,12 +5,18 @@
 class Exporter(TaskManager):
     final_output = True

+    def get_export(self):
+        return Export(self.job.id, basename="full.jsonl.gz")
+
     def run(self):
+        self.get_export().unlock()
+
         publish({"job_id": self.job.id, "collection_id": self.job.context["process_id_pelican"]}, "exporter_init")

     def get_status(self):
-        status = Export(self.job.id, basename="full.jsonl.gz").status
-        return exporter_status_to_task_status(status)
+        export = self.get_export()
+
+        return exporter_status_to_task_status(export.status)

     @skip_if_not_started
     def wipe(self):
4 changes: 4 additions & 0 deletions data_registry/process_manager/task/flattener.py
@@ -12,6 +12,10 @@ def get_exports(self):
             yield Export(self.job.id, basename=f"{path.name[:-9]}.csv.tar.gz")  # remove .jsonl.gz

     def run(self):
+        for export in self.get_exports():
+            if export.running:
+                export.unlock()
+
         publish({"job_id": self.job.id}, "flattener_init")

     def get_status(self):
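The `run()` changes in both tasks clear lockfiles before publishing the init message. A minimal sketch of that mechanism follows. `LockfileExport`, its `.lock` suffix and `clear_stale_locks` are hypothetical names for illustration and are not the registry's actual `Export` implementation; only `running` and `unlock()` appear in the diff.

```python
import tempfile
from pathlib import Path


class LockfileExport:
    """Hypothetical stand-in for the registry's Export class (assumption, not the real API)."""

    def __init__(self, directory: Path, basename: str):
        self.path = directory / basename
        # Assumed convention: the lockfile sits next to the export with a ".lock" suffix.
        self.lockfile = directory / f"{basename}.lock"

    @property
    def running(self) -> bool:
        # A lockfile on disk is taken to mean a worker is, or was, writing the export.
        return self.lockfile.exists()

    def lock(self) -> None:
        self.lockfile.touch()

    def unlock(self) -> None:
        # missing_ok makes unlock() safe to call when no lockfile exists.
        self.lockfile.unlink(missing_ok=True)


def clear_stale_locks(exports) -> int:
    """Remove leftover lockfiles, mirroring the Flattener loop, and return how many were removed."""
    removed = 0
    for export in exports:
        if export.running:
            export.unlock()
            removed += 1
    return removed
```

In this sketch, a worker that crashed mid-export would leave its lockfile behind, so the export would keep reporting as running until `clear_stale_locks` is called. Calling `unlock()` unconditionally, as the Exporter does, has the same effect for a single export.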
16 changes: 15 additions & 1 deletion docs/admin/siteadmin.rst
@@ -109,9 +109,23 @@ A job can stall (always "running"). The only option is to `cancel <https://scrap
 Restart a task
 ~~~~~~~~~~~~~~

+You can restart the :ref:`Exporter<cli-exporter>` and :ref:`Flattener<cli-flattener>` tasks. Do this only if the ``data_registry_production_exporter_init`` and ``data_registry_production_flattener_init`` queues are empty in the `RabbitMQ management interface <https://rabbitmq.data.open-contracting.org/>`__.
+
+.. note::
+
+   The Flattener task publishes one message per file. You might receive a Sentry notification about a failed conversion, while other conversions are still enqueued or in-progress.
+
+   The Exporter task publishes one message per job. This task *can* be restarted while the queue is non-empty – as long as another administrator has not restarted it independently.
+
+#. `Access the job <https://data.open-contracting.org/admin/data_registry/job/>`__
+#. Set only the *Exporter* and/or *Flattener* task's *Status* to *PLANNED*
+#. Click *SAVE*
+
+Any lockfiles are deleted to allow the task to run.
+
 .. attention::

-   To properly implement this feature, see `#354 <https://github.com/open-contracting/data-registry/issues/354>`__ (for retryable tasks) and `#350 <https://github.com/open-contracting/data-registry/issues/350>`__ (for non-retryable tasks).
+   See `#350 <https://github.com/open-contracting/data-registry/issues/350>`__.

 Unpublish or freeze a publication
 ---------------------------------
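The documentation change asks administrators to confirm that both init queues are empty before restarting a task. That check could also be scripted against the RabbitMQ management HTTP API, which reports a `messages` count per queue at `/api/queues/{vhost}/{name}`. The sketch below is an assumption-laden illustration: the helper names, the vhost and the credentials are hypothetical, and only the two queue names come from the diff.

```python
import base64
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

# Queue names taken from the documentation change above.
QUEUES = (
    "data_registry_production_exporter_init",
    "data_registry_production_flattener_init",
)


def queue_url(base_url: str, vhost: str, name: str) -> str:
    """Build the management API URL for one queue, percent-encoding the vhost and name."""
    return f"{base_url.rstrip('/')}/api/queues/{quote(vhost, safe='')}/{quote(name, safe='')}"


def queue_is_empty(info: dict) -> bool:
    """Return True only if the queue explicitly reports zero messages.

    A missing "messages" key is treated as "not known to be empty", to stay conservative.
    """
    return info.get("messages") == 0


def fetch_queue(base_url: str, vhost: str, name: str, user: str, password: str) -> dict:
    """Fetch one queue's JSON description using HTTP basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = Request(queue_url(base_url, vhost, name), headers={"Authorization": f"Basic {token}"})
    with urlopen(request, timeout=10) as response:
        return json.load(response)
```

With real credentials, `all(queue_is_empty(fetch_queue(base_url, vhost, name, user, password)) for name in QUEUES)` would indicate whether a restart is safe under the rule stated in the documentation. The vhost used by the production deployment is not given in the source, so it is left as a parameter.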
