RFC #6: Refactoring Request Management System

Authors: K.Ciba, A.Tsaregorodtsev

Last Modified: 11.11.2012

Introduction

The Request Management System (RMS) manages simple operations that are performed asynchronously on behalf of users, the owners of the tasks. The RMS serves multiple purposes: failure recovery (the failover system), data management tasks and some others. It should be designed as an open system that is easily extensible to new types of tasks.

Basic architecture and functionality

The core of the RMS is a request, which holds information about its creator (DN and group), its status, various timestamps (creation time, submission time, last update), the ID of the job to which the request belongs, the DIRAC setup and the request's name. One request can be made of several sub-requests of various types (e.g. transfer, removal, registration, logupload, diset). Each sub-request records the operation that has to be executed to process the request (e.g. replicateAndRegister, registerFile, removeReplica), the source and destination storage elements to use, its status, various timestamps, error messages if any and its order of execution. Depending on its type and operation, a sub-request can reference several files, each of which holds all the required information (e.g. the file's LFN or PFN or both, its checksum and size, its GUID, status and error message).
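The request, sub-request and file hierarchy described above can be pictured as three nested record types. The following is only an illustrative sketch: the class names, field names and the "Waiting" status value are assumptions made for this example and are not the actual RMS classes; timestamps are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RequestFile:
    """One file referenced by a sub-request (hypothetical field names)."""
    lfn: str
    pfn: Optional[str] = None
    checksum: Optional[str] = None
    size: int = 0
    guid: Optional[str] = None
    status: str = "Waiting"      # assumed status value
    error: Optional[str] = None


@dataclass
class SubRequest:
    """One operation to execute, with the files it acts on."""
    request_type: str            # e.g. "transfer", "removal", "registration"
    operation: str               # e.g. "replicateAndRegister"
    source_se: Optional[str] = None
    target_se: Optional[str] = None
    execution_order: int = 0
    status: str = "Waiting"
    error: Optional[str] = None
    files: List[RequestFile] = field(default_factory=list)


@dataclass
class Request:
    """Top-level request owning an ordered set of sub-requests."""
    name: str
    owner_dn: str
    owner_group: str
    job_id: Optional[int] = None
    setup: Optional[str] = None
    status: str = "Waiting"
    sub_requests: List[SubRequest] = field(default_factory=list)

    def next_sub_requests(self) -> List[SubRequest]:
        """Return the waiting sub-requests with the lowest execution order."""
        waiting = [s for s in self.sub_requests if s.status == "Waiting"]
        if not waiting:
            return []
        lowest = min(s.execution_order for s in waiting)
        return [s for s in waiting if s.execution_order == lowest]
```

In this sketch the execution order is a plain integer on each sub-request, so a removal with order 1 is only offered for execution once every sub-request with order 0 has left the waiting state.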

All request information is kept in the RequestDB database, which can use two kinds of back-end: MySQL (RequestDBMySQL) and a local file system directory (RequestDBFile). Both are accessed through one common service (RequestManagerHandler), which talks directly to the RequestDB and allows selection, insertion or update of a particular request. All these CRUD operations are performed through a specialised client interface (RequestClient).
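The idea of one service in front of interchangeable back-ends can be sketched as follows. This is a minimal illustration, not the RequestDBMySQL, RequestDBFile or RequestManagerHandler code: all class and method names here are hypothetical, an in-memory dictionary stands in for the MySQL back-end, and requests are reduced to plain dictionaries stored as JSON.

```python
import json
import os
import tempfile
from abc import ABC, abstractmethod


class RequestBackend(ABC):
    """Hypothetical common interface shared by all storage back-ends."""

    @abstractmethod
    def put(self, name, request):
        """Insert or update the request stored under `name`."""

    @abstractmethod
    def get(self, name):
        """Return the request stored under `name`, or None."""


class MemoryBackend(RequestBackend):
    """Stand-in for a database back-end: keeps requests in a dict."""

    def __init__(self):
        self._requests = {}

    def put(self, name, request):
        self._requests[name] = dict(request)

    def get(self, name):
        return self._requests.get(name)


class DirectoryBackend(RequestBackend):
    """Stores each request as one JSON file in a local directory."""

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def _path(self, name):
        return os.path.join(self.directory, name + ".json")

    def put(self, name, request):
        with open(self._path(name), "w") as handle:
            json.dump(request, handle)

    def get(self, name):
        path = self._path(name)
        if not os.path.exists(path):
            return None
        with open(path) as handle:
            return json.load(handle)


class RequestService:
    """Single entry point; callers never see which back-end is in use."""

    def __init__(self, backend):
        self.backend = backend

    def set_request(self, name, request):
        self.backend.put(name, request)

    def get_request(self, name):
        return self.backend.get(name)


def demo_roundtrip():
    """Store and fetch one request through the directory back-end."""
    with tempfile.TemporaryDirectory() as directory:
        service = RequestService(DirectoryBackend(directory))
        service.set_request("req1", {"status": "Waiting", "job_id": 7})
        return service.get_request("req1")
```

Because the service depends only on the abstract interface, swapping `DirectoryBackend` for `MemoryBackend` requires no change on the caller's side.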

Requests are executed by various specialised agents, one for each request type: TransferAgent processes transfer sub-requests, RemovalAgent removal sub-requests, RegistrationAgent registration sub-requests, DISETForwardingAgent diset sub-requests, and so on. The common pattern in the agent code is to select the sub-requests available for execution, perform the data manipulation needed to execute the defined operation, update the statuses in the RequestDB and notify the request's job when all sub-requests are done.
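The common agent pattern described above (select, execute, update status, notify) can be sketched as one cycle of a generic agent. This is an illustration under stated assumptions, not the code of TransferAgent or any other DIRAC agent: the function name, the dictionary layout of a request, the status strings and the `execute` and `notify_job` callbacks are all hypothetical.

```python
def run_agent_cycle(request, handled_type, execute, notify_job):
    """Run one cycle of a hypothetical agent over a single request.

    request      -- dict with "job_id" and a list of "sub_requests"
    handled_type -- the sub-request type this agent is responsible for
    execute      -- callable performing the operation; raises on failure
    notify_job   -- callable invoked with the job ID once everything is done
    """
    sub_requests = request["sub_requests"]

    # 1. Select the sub-requests of our type that are available for execution.
    selected = [
        sub for sub in sub_requests
        if sub["type"] == handled_type and sub["status"] == "Waiting"
    ]

    # 2. Execute each operation and 3. record the resulting status.
    for sub in selected:
        try:
            execute(sub)
            sub["status"] = "Done"
        except Exception as exc:
            sub["status"] = "Failed"
            sub["error"] = str(exc)

    # 4. Notify the owning job only when every sub-request is done.
    all_done = bool(sub_requests) and all(
        sub["status"] == "Done" for sub in sub_requests
    )
    if all_done:
        notify_job(request["job_id"])
    return all_done
```

A real agent would persist the status changes to the RequestDB instead of mutating a dictionary in place; the sketch only shows the control flow that the specialised agents share.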

Updating the RMS architecture

While on the database side a request is kept in three closely connected tables (RequestDB.Requests, RequestDB.SubRequests and RequestDB.Files), on the Python client side only one class is available: RequestContainer. This imbalance between the SQL and Python worlds leads to an API that is unclear, too heavy, error-prone and not easy to use.

Updated Database schema

Operations, ...

Basic objects

Classes, properties, ... Status updates

State machine

Execution engine with pluggable modules for "operation executors"

RMS Monitoring

Command line tools

Web page

Separation of the RequestDB and TransferDB

Separation, moving TransferDB to DataManagement System, one to one correspondence of requests and FTS jobs
