diff --git a/README.md b/README.md
index 35db8fc..75201e8 100644
--- a/README.md
+++ b/README.md
@@ -34,10 +34,14 @@ source venv/bin/activate
 
 # Install Python dependencies
 python3 -m pip install 'py-orca[all]' 'metaflow' 'pyyaml' 's3fs'
+```
 
-# Run the script using an example dataset.
-# Ensure that the S3 prefix contains a bucket your tower workspace has access to.
-python3 demo.py run --dataset_id 'syn51514585' --s3_prefix 's3://orca-service-test-project-tower-bucket/outputs'
+Before running the example below, ensure that the `s3_prefix` points to an S3 bucket your Nextflow `dev`
+or `prod` tower workspace has access to. In the example below, we will point to the `example-dev-project-tower-scratch` S3 bucket because we will be launching our workflows within the
+`example-dev-project` workspace in `tower-dev`.
+```bash
+# Run the script using an example dataset
+python3 demo.py run --dataset_id 'syn51514585' --s3_prefix 's3://example-dev-project-tower-scratch/work'
 ```
 
 The above dataset ID ([`syn51514585`](https://www.synapse.org/#!Synapse:syn51514585)) refers to the following YAML file, which should be accessible to Sage employees. Similarly, the samplesheet ID below ([`syn51514475`](https://www.synapse.org/#!Synapse:syn51514475)) should also be accessible to Sage employees. However, there is no secure way to make the output folder accessible to Sage employees, so the `synindex` step will fail if you attempt to run this script using the example dataset ID. This should be sufficient to get a feel for using `py-orca`, but feel free to create your own dataset YAML file on Synapse with an output folder that you own.