
Pilot Data based File Management

drelu edited this page Dec 22, 2012 · 12 revisions

Data Compute Dependency Management

# start compute unit
# data_unit is assumed to be a Data-Unit that was created earlier
compute_unit_description = {
    "executable": "/bin/cat",
    "arguments": ["test.txt"],
    "number_of_processes": 1,
    "output": "stdout.txt",
    "error": "stderr.txt",
    # this stages the content of the data unit to the working directory of the compute unit
    "input_data": [data_unit.get_url()],
    # files in the working directory that match "std*" are staged out to the data unit
    "output_data": [{data_unit.get_url(): ["std*"]}],
    "affinity_datacenter_label": "eu-de-south",
    "affinity_machine_label": "mymachine-1",
}
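
The "std*" entry under "output_data" selects which files are staged out. As a rough illustration only, and assuming the pattern is interpreted as a shell-style wildcard (the page does not state the exact matching rules), the standard-library fnmatch module shows which of the files named above such a pattern would select:

```python
import fnmatch

# Files a compute unit like the one above might leave in its working directory
files = ["stdout.txt", "stderr.txt", "test.txt"]

# Assumption: "std*" behaves like a shell-style wildcard
staged = [name for name in files if fnmatch.fnmatchcase(name, "std*")]

print(staged)
```

Under that assumption, stdout.txt and stderr.txt would be selected and test.txt would be left behind.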

Input Staging

Output Staging

The process is: (i) create a Pilot-Data at the location where you want to move the output files; (ii) create an empty Data-Unit (DU) and bind it to that Pilot-Data. A Data-Unit is a logical container for a set of data, while a Pilot-Data is a physical store for a set of DUs. This means that you can simply create another DU in the Pilot-Data where your input DU resides.
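
A minimal sketch of how a compute unit description could point at two different DUs, one for input and a separate, initially empty one for output. The helper name build_compute_unit_description and the two URLs are hypothetical placeholders. In practice the URLs would come from get_url() on the respective Data-Units. Only the dictionary keys are taken from the example earlier on this page.

```python
def build_compute_unit_description(input_du_url, output_du_url, patterns=("std*",)):
    """Build a description that reads from one DU and writes to another.

    Hypothetical helper for illustration; it only assembles a plain dict.
    """
    return {
        "executable": "/bin/cat",
        "arguments": ["test.txt"],
        "number_of_processes": 1,
        "output": "stdout.txt",
        "error": "stderr.txt",
        # content of the input DU is staged to the working directory
        "input_data": [input_du_url],
        # matching files are staged out to the (initially empty) output DU
        "output_data": [{output_du_url: list(patterns)}],
    }


# Placeholder URLs (assumptions); real values come from data_unit.get_url()
input_du_url = "du://example/input-du"
output_du_url = "du://example/output-du"

description = build_compute_unit_description(input_du_url, output_du_url)
print(description["output_data"])
```

Keeping the output DU separate from the input DU means the staged-out files do not mix with the input data, even when both DUs live in the same Pilot-Data.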