From 646d858c7968072b6757f06908344a38ef730ba0 Mon Sep 17 00:00:00 2001 From: Justin Hiemstra Date: Tue, 9 Apr 2024 20:10:22 +0000 Subject: [PATCH 1/4] Update origin documentation --- docs/pages/serving_an_origin.mdx | 280 ++++++++++++++++++++++++++----- 1 file changed, 234 insertions(+), 46 deletions(-) diff --git a/docs/pages/serving_an_origin.mdx b/docs/pages/serving_an_origin.mdx index b50292b12..964b080d9 100644 --- a/docs/pages/serving_an_origin.mdx +++ b/docs/pages/serving_an_origin.mdx @@ -2,9 +2,11 @@ import ExportedImage from "next-image-export-optimizer"; # Serve a Pelican Origin -The [Pelican](http://pelicanplatform.org/) *Origin* connects your data to a Pelican data federation to allow data sharing. It acts like an adapter plug to your data store, which means it does **NOT** hold any data itself. Rather, it takes your storage backend, such as a POSIX file system or S3 buckets, does the dirty work of communicating with a Pelican federation, and exposes your data to federation memebers. You will have fine-grain control of how your data can be accessed with Pelican. +Pelican users who want to share data within a Pelican federation do so via an *Origin*. Origins are a crucial component of Pelican's architecture for two reasons: they act as an adapter between various storage backends and Pelican federations, and they provide fine-grained access controls for that data. That is, they figure out how to take data from wherever it lives (such as a POSIX filesystem, S3 buckets, HTTPS servers, etc.) and transform it into a format that the federation can utilize while respecting your data access requirements. -This document contains instructions on how to serve a Pelican origin. +> An important distinction between origins and data backends is that, generally speaking, origins do **NOT** store any data themselves; their primary function is to facilitate data accessibility. 
+ +This document contains instructions on how to serve a Pelican origin on top of a variety of storage backend types. ## Before Starting @@ -15,27 +17,27 @@ If you haven't installed Pelican, follow the instructions to [install pelican](/ For _Linux_ users, it is recommended to install Pelican using one of the package managers (RPM, APK, Deb, etc.) so that Pelican dependecies are automatically handled. You may also run a [Pelican docker image](./install/docker.mdx) to serve a Pelican origin. If you prefer to install Pelican as a standalone binary, you need to follow [additional instructions](https://osg-htc.org/docs/data/xrootd/install-standalone/#install-xrootd-standalone) to install dependencies for the Pelican origin server. -> Note that serving a Pelican origin with a standalone Pelican binary is possible, but not recommended or supported. +> **NOTE:** Serving origins with a standalone Pelican binary is possible, but not recommended. For _macOS_ and _Windows_ users who want to serve a Pelican origin, please use [Pelican docker image](./install/docker.mdx). ### Open Firewall Port for Pelican Origin -Pelican origin server listens to two TCP ports for file transfers and Web UI. By default, the file transfer port is at `8443` and the Web UI and APIs port is at `8444`. If your server has firewall policy in place, please open the two ports for both incoming the outgoing TCP requests to allow Pelican origin functions as expected. +At their core, Pelican origins are web servers that listen to two TCP ports for file transfers and Web UI. By default, the browser/API interface for your origin will be at port `8444`, and the port for object transfers will be at `8443`. You may change these port numbers through the [configuration file](./parameters.mdx) with parameters [`Origin.Port`](./parameters.mdx#Origin-Port) and [`Server.WebPort`](./parameters.mdx#Server-WebPort) respectively. 
-You may change the port numbers through the [configuration file](./parameters.mdx) with parameter [`Origin.Port`](./parameters.mdx#Origin-Port) and [`Server.WebPort`](./parameters.mdx#Server-WebPort) respectively. +In order for Pelican origins to work properly, these ports need to be accessible by the federation, which in most cases means they need to be open to the internet. If your server host has a firewall policy in place, please open these two ports for both incoming and outgoing TCP requests. -> If it is not possible for you to expose any port through the firewall, Pelican has a special feature called _Connection Broker_, where it allows you to serve a Pelican origin without a public-accessible port and any TLS credential files in place. However, this is an experimental feature and requires the Pelican federation you are joining to be compatible. If you are interested in learning more about _Connection Broker_, please contact help@pelicanplatform.org for further instructions. +> **NOTE:** If it is not possible for you to expose any ports through the firewall (e.g., you're on a local network or behind a NAT), Pelican has a special feature called a _Connection Broker_ that allows you to serve origins without publicly accessible ports or TLS credentials. However, this is an experimental feature and requires the Pelican federation you are joining to be compatible. If you are interested in learning more about the Connection Broker, please contact help@pelicanplatform.org for further instructions. ### Prepare TLS Credentials -Pelican servers use `https` for serving its web UI and handling internal http requests. `https` requires a set of credential files in place to work, including: +Data transfers in Pelican rely on HTTPS, the encrypted protocol used by everyone from banks to Instagram to securely transmit data between internet-connected computers. 
To configure the origin with HTTPS, you'll first need to acquire three things: -- A valid TLS certificate +- A valid Transport Layer Security (TLS) certificate - The private key associated with the certificate - The Intermediate Certificate or the chain file, that establishes the trust chain to a root certificate -> For local development and testing, you may skip setting up TLS credentials by setting configuration parameter `TLSSkipVerify` to `true`. You should **NOT** set this for production. +> **NOTE:** For local development and testing, you may skip setting up TLS credentials by setting configuration parameter `TLSSkipVerify` to `true`. You should **NOT** set this for production, as it makes all data, including your passwords, available to anyone who can monitor your network. You need to contact a Certificate Authority (CA) who owns the root certificate for getting these credentials. One popular CA that provides free TLS certificates is [Let's Encrypt](https://letsencrypt.org/). You may follow [their guide](https://letsencrypt.org/getting-started/) to obtain the credentials listed above. **Note that you need to have a valid domain before proceeding.** @@ -55,47 +57,138 @@ Once you go through the process, locate your credential files and set the follow Since your TLS certificate is associated with your domain name, you will need to change the default hostname of Pelican server to be consistent. Set `Server.Hostname` to your domain name (e.g. `example.com`). -## Serve an Origin +## Serving Origins + +When you've completed the aforementioned steps, you're ready to start configuring the origin that will add your data to a federation. Serving an origin is the process of taking some underlying storage repository and making its data accessible via a namespace prefix in your federation. 
For example, you might make files in the directory `/my/directory` available at the federation path `/my/namespace` so that anyone with access to the federation can get objects from the directory + +By default, Pelican origins serve files from a POSIX backend, the filesystem used by Linux computers. However, Pelican aims to support a variety of backends and we currently also support serving objects from S3. Configuration for S3 is mostly similar to configuration for POSIX filesystems, but with a few important differences. For information about S3 backends, jump to our [s3 documentation](#launch-the-origin-with-an-s3-storage-backend) > If you are running Pelican docker image to serve an origin, please refer to the [docker image documentation](./install/docker.mdx#run-pelican-origin-server). ### Find a federation to join -Before serving an origin, you need to find a Pelican federation to join in. If you are unfamiliar with the term **federation**, refer to [Useful Terminology](./client-usage.mdx#useful-terminology) before proceeding. +Before serving an origin, you need to decide which [**federation**](./client-usage.mdx#useful-terminology) your data will be accessed through. For example, the Open Science Data Federation (OSDF) is Pelican's flagship federation, and if you are interested in serving an OSDF origin, you can refer to the [OSDF website](https://osg-htc.org/services/osdf.html) for details about how to join. + +Federations are identified by a URL, which is used to host information your origin will need in order to discover other federation services. For example, the OSDF's federation URL is `https://osg-htc.org`, and an origin that joins the OSDF will visit `https://osg-htc.org/.well-known/pelican-configuration` to get important metadata about the federation's central services (the Director and Registry). 
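As a rough illustration of the discovery step described above, the sketch below builds the well-known metadata URL from a federation URL. The helper name `discovery_url` is hypothetical and not part of Pelican; only the `/.well-known/pelican-configuration` path and the OSDF URL come from the text.

```shell
#!/bin/sh
# Hypothetical helper (not part of Pelican): derive the well-known
# discovery endpoint from a federation URL, as described in the text.
discovery_url() {
  # Strip one trailing slash, if present, before appending the path.
  printf '%s/.well-known/pelican-configuration\n' "${1%/}"
}

discovery_url "https://osg-htc.org"

# To inspect the metadata yourself (requires network access), you could run:
#   curl -s "$(discovery_url "https://osg-htc.org")"
```

Looking at the returned document by hand can be a quick sanity check that a federation URL is correct before pointing an origin at it.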
-If you don't have a federation in mind, the Open Science Data Federation (OSDF) is an example Pelican federation that you can join in for testing purposes. If you are interesting in serving an OSDF origin, refer to the [OSDF website](https://osg-htc.org/services/osdf.html) for details. +To point your origin at a specific federation, you can either pass the `-f ` flag if running from the command line, or configure `Federation.DiscoveryUrl: ` in your config yaml. -The federation discovery URL for OSDF is `osg-htc.org`. You may use this as your `` argument in the next section when launching your origin. +### Serve the Origin -### Launch the Origin +Origins can be configured via the command line, via a config file named `pelican.yaml`, via environment variables, or through a combination of the three. While simple origins can be run entirely from command line arguments, more complex origins will require configuration in your `pelican.yaml`. -To launch a pelican origin, run: +To start a simple Pelican origin from the command line that serves POSIX data, run: ```bash -pelican origin serve -f -v : +pelican origin serve -f -v : ``` Where: -* `` is the URL to the federation the origin will be joining in -* `` is the directory containing objects to be exported to the federation -* `` is the namespace at which files from `` will be made available in the federation. Note that a namespace prefix must start with `/` +* `` is the federation URL discussed above +* `` is the absolute path to the directory containing files you want to export as Pelican objects in your federation +* `` is the federation prefix at which files in `/path/to/data` will be accessible in the federation. Note that federation prefixes follow POSIX path conventions, and they must begin with `/` to denote an absolute path. + +> **NOTE:** By default, origins require authorization tokens for object access. 
There's currently no way to serve a public origin using only the command line, but you can find more information about various access controls by looking at [origin capabilities](#origin-and-namespace-capabilities) below. + +To run the same origin using a `pelican.yaml` configuration file, save your configuration to `/etc/pelican/pelican.yaml` if you're running Pelican as root, and at `~/.config/pelican/pelican.yaml` if you're running as a non-root user. The command line origin from above could be configured accordingly: + +``` +# Tell Pelican which federation you're joining +Federation: + DiscoveryUrl: + +# Configure your Origin +Origin: + # POSIX is the default storage type for Pelican origins + # and can be omitted + StorageType: "posix" + + Exports: + - StoragePrefix: "/path/to/data" + FederationPrefix: "/your/federation/prefix" + # Explicitly state what capabilities you want this prefix to have + Capabilities: ["Reads", "Writes"] + +``` +and then simply run +```bash +pelican origin serve +``` +Pelican will read the config file and apply it to your origin. + + +Finally, origins can be configured to a limited extent with environment variables. In Pelican's environment variable model, configuration options are taken from `pelican.yaml`, flattened, and prepended with either `PELICAN_` or `OSDF_`, depending on the name of the binary you're using (ie whether you run `osdf serve` or `pelican serve` commands). -This will start Pelican origin as a daemon process. +For example, you might configure the origin's storage type by setting the environment variable `PELICAN_ORIGIN_STORAGETYPE=posix`. + +> Environment variable configuration does not support the same complex structures that can be built with yaml configuration, such as lists. 
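The flattening rule described above can be sketched in shell. This generalizes from the single `PELICAN_ORIGIN_STORAGETYPE` example in the text: the helper `to_env_name` is hypothetical, and the assumption that every `.` becomes `_` should be checked against the Parameters page before relying on it for other options.

```shell
#!/bin/sh
# Hypothetical helper (not part of Pelican): turn a dotted YAML key such as
# Origin.StorageType into the environment variable name implied by the text.
to_env_name() {
  prefix="$1"   # PELICAN or OSDF, depending on the binary name
  key="$2"      # dotted config key
  # Assumption: "." becomes "_" and the result is upper-cased.
  printf '%s_%s\n' "$prefix" \
    "$(printf '%s' "$key" | tr '.' '_' | tr '[:lower:]' '[:upper:]')"
}

to_env_name PELICAN Origin.StorageType

# Setting the variable from the text's example for the current shell session:
export PELICAN_ORIGIN_STORAGETYPE=posix
```

Scalar options like the storage type fit this model well. List-valued settings such as `Origin.Exports` do not, which is why they belong in `pelican.yaml`.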
The first time the origin is started, you will see something that looks like the following: ```console -$ pelican origin serve -f osg-htc.org -v $PWD:/demo +$ pelican origin serve -f https://osg-htc.org -v $PWD:/demo Pelican admin interface is not initialized To initialize, login at https://localhost:8444/view/initialization/code/ with the following code: 551220 ``` +See the [browser configuration](#login-to-admin-website) documentation section for more information about initializing your origin's web interface. + +### Origin and Namespace Capabilities + +Capabilities are the configuration options you can assign to origins and namespace prefixes to determine what kinds of access controls you want them to respect. In the previous yaml configuration, we configured namespace capabilities using the `Capabilities` list of the `Origin.Exports` block by specifying that the federation prefix `/your/federation/prefix` supports "Reads" and "Writes". This list of capabilities can be used for further control of what types of operations the namespace is willing to support. Available capabilities include: + +- "Reads": When included, objects from the namespace can be read with a valid authorization token. +- "PublicReads": When included, objects from the namespace become public and require no authorization to read. +- "Writes": When included, objects can be written back to the storage backend by Pelican. Write operations _always_ require a valid authorization token. +- "DirectReads": When included, a namespace indicates that it is willing to serve clients directly, without requiring data to be pulled through a cache. Omitting this capability may be useful in cases where the origin isn't very performant or has to pay egress costs when data moves through it. Note that this is respected by federation central services, but may not be respected by all clients. +- "Listings": When included, the namespace indicates it will allow object discovery. 
Be careful when setting this for authorized namespaces, as this will allow anyone to discover the names of objects exported by this namespace. + +> **NOTE:** Most origins should have either "Reads" or "PublicReads" enabled. If neither is set, the origin won't export any data. + +There is an important distinction between _origin_ capabilities and _namespace_ capabilities. While it's sometimes easy to treat origins and namespaces as the same thing, Pelican must distinguish between them because two separate origins may export portions of the same namespace, and a single origin may export two disparate prefixes. The only exception to this rule is when a single origin serves a single namespace. + +To configure _origin_ capabilities, you can set top-level options for the origin: -### Additional arguments to launch the Origin +- `Origin.EnableReads`: When true, the origin supports reads that are accompanied by a valid authorization token. +- `Origin.EnablePublicReads`: When true, the origin supports reads by anyone without an authorization token. +- `Origin.EnableWrites`: When true, objects can be written back to the storage backend through the origin. Writes always require a valid authorization token. +- `Origin.EnableDirectReads`: When true, the origin indicates it's willing to serve clients directly, potentially without caching data. Note that this is respected by federation central services, but may not be respected by all clients. +- `Origin.EnableListings`: When true, the origin will allow object discovery. + +If no `Origin.Exports` block is provided to Pelican, these values will also be applied to your federation prefix. + +> **NOTE:** Pelican tries to resolve differences between origin and namespace configurations by respecting the more restrictive of the two. If you serve an origin that enables public reads, but the underlying prefix it exports disables all reads, you won't be able to read from that namespace. 
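One way to picture the "more restrictive setting wins" rule from the note above is as a logical AND of the origin-level and namespace-level values. The sketch below is only a mental model under that assumption, not Pelican's actual implementation, and the function name `effective_capability` is hypothetical.

```shell
#!/bin/sh
# Illustrative only: model "the more restrictive setting wins" as a
# logical AND of the origin-level and namespace-level values.
effective_capability() {
  origin_value="$1"      # "true" or "false", e.g. Origin.EnablePublicReads
  namespace_value="$2"   # "true" if the capability is listed for the prefix
  if [ "$origin_value" = "true" ] && [ "$namespace_value" = "true" ]; then
    echo "true"
  else
    echo "false"
  fi
}

# Origin enables public reads, but the exported prefix does not list them:
effective_capability true false
```

Under this model, a capability is usable only when both the origin and the exported prefix allow it, which matches the public-reads example in the note.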
+ +### Multi-Export Origins + +The previous examples have shown how one might export a single namespace, but Pelican origins can export multiple paths from the same storage backend under different namespaces. For example, you might have two POSIX directories called `/my/data/public` and `/my/data/private`. If you want to make your public data available under the namespace `/my/prefix/public` and your private data available under `/my/prefix/private`, you'll need to configure a multi-export origin, which is accomplished through the origin's `Exports` block. Below is an example of what that looks like, along with how you could configure access control for the two namespaces: + +``` +Federation: + DiscoveryUrl: https://my-federation.com + +Origin: + StorageType: posix + + # The actual namespaces we export + Exports: + - StoragePrefix: /my/data/public + FederationPrefix: /my/prefix/public + # Don't set Reads -- it should be toggled true by setting PublicReads + Capabilities: ["PublicReads", "Listings", "DirectReads"] + - StoragePrefix: /my/data/private + FederationPrefix: /my/prefix/private + # We set "Reads" but not "PublicReads" indicating we want authorization + Capabilities: ["Reads", "DirectReads"] +``` -This section documents the additional arguments you can pass to the command above to run the origin. +> **NOTE:** While multiple namespaces can be exported by the same origin, they must all have the same underlying storage type. That is, if the origin serves files from POSIX, it must only serve files from POSIX and not S3. + +### Additional Command Line Arguments for Origins + +This section documents additional arguments you can pass via the command line when serving origins. * **-h or --help**: Output documentation on the `serve` command and its arguments. * **-m or --mode**: Set the mode for the origin service ('posix'|'s3, default to 'posix'). 
@@ -103,18 +196,111 @@ This section documents the additional arguments you can pass to the command abov * **--writeable**: A boolean value to allow or disable writting to the origin (default is true). * **--config**: Set the location of the configuration file. -* **-d or --debug**: Enable the debugging mode, allowing for more verbose log -* **-l or --log**: Set the location of the file where log messages should be redirected to and not outputing the the console. +* **-d or --debug**: Enable debugging mode, which greatly increases Pelican's logging verbosity. +* **-l or --log**: Set the location of a file that will capture Pelican logs. Setting this will prevent logging output from printing to your terminal. + +For more information about available yaml configuration options, refer to the [Parameters page](./parameters.mdx). + +## Launch the Origin With an S3 Storage Backend + +### What is S3? + +S3, or "Simple Storage Service", is a type of object store introduced by Amazon Web Services (AWS) in 2006. Since then, the term S3 has grown to represent both the _service_ offered by Amazon and the _protocol_ used by Amazon and by many other providers who have no AWS affiliation. In general, Pelican works with any S3 provider and is not limited to what's offered by AWS. References to "S3" in Pelican documentation should be interpreted as "S3 the protocol." + +Unlike POSIX, which uses "files" organized into hierarchical directories with associated owners/permissions and a host of other metadata that are packaged together to act as a fundamental unit, S3 works with "objects" stored in "buckets". Typically, objects consist of data, metadata, and a unique identifier, and they are stored in a flat address space referred to as a bucket. Because of this, there is no inherent hierarchy or nesting like there would be in a file system. 
One goal of Pelican is to hide the underlying differences between storage backends like these so that users can enjoy a common interface for all their data, wherever it may happen to come from. -There are other configurations available to modify via the configuration file. Refer to the [Parameters page](./parameters.mdx) for details. +### Serving an S3 Origin -### Launch the Origin with an S3 storage backend +Serving S3 origins with Pelican is similar to serving POSIX origins, but with several key differences. The first is that Pelican must be configured to host an S3 backend, using the configuration option `Origin.StorageType = s3`. To make your origin work with S3, Pelican needs to know at least four additional things: + +- The URL you use to access objects from S3, also known as the _S3 Service URL_ +- The _region_ that your S3 instance is hosted out of (almost always `us-east-1` unless you're actually using S3 from Amazon) +- The name of the bucket your objects are stored in +- The type of bucket hosting used at the S3 service URL, which can be either _path_ or _virtual_. This determines whether objects are normally accessed like `https:////` (path-style hosting) or `https://./` (virtual-style hosting), but does not change the way you access objects through Pelican. In many cases, it's safe to assume _path_-style hosting, and this is Pelican's default. For more information about different hosting styles in S3, see the [AWS documentation](https://docs.aws.amazon.com/AmazonS3/latest/userguide/VirtualHosting.html). + +> **NOTE:** Pelican has a special mode where no bucket information is provided that allows you to export objects from all public buckets at a given service URL. This is covered in further detail [later in this section](#exporting-an-entire-s3-endpoint). + +The service URL, region, and hosting style can be configured using the Pelican config variables `Origin.S3ServiceUrl`, `Origin.S3Region`, and `Origin.S3UrlStyle`. 
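To make the difference between the two hosting styles concrete, the sketch below builds an object URL in each style, following the general convention described in the linked AWS documentation. The function `s3_object_url` and the host, bucket, and object names are placeholders for illustration; they are not Pelican settings or commands.

```shell
#!/bin/sh
# Illustrative only: how the two S3 hosting styles shape an object URL.
# The host, bucket, and object names used below are placeholders.
s3_object_url() {
  style="$1"; host="$2"; bucket="$3"; object="$4"
  case "$style" in
    # Path style: the bucket is the first path component.
    path)    printf 'https://%s/%s/%s\n' "$host" "$bucket" "$object" ;;
    # Virtual style: the bucket is a subdomain of the service host.
    virtual) printf 'https://%s.%s/%s\n' "$bucket" "$host" "$object" ;;
    *)       echo "unknown style: $style" >&2; return 1 ;;
  esac
}

s3_object_url path    my-s3.com my-bucket my-object
s3_object_url virtual my-s3.com my-bucket my-object
```

Whichever style the backend uses, it only affects how the origin talks to S3; federation users still address objects through the federation prefix.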
+ +Additionally, some buckets might require credentials that prove you're allowed to access the objects they contain. In S3, these credentials are called the _access_ key and _secret_ key (In some cases the access key may also be referred to as the _API_ key). Essentially, they can be treated like a username and password, where the access/API key is your username and the secret key is your password. When a bucket you'd like to export requires authentication, you'll need to pass these values to Pelican by putting your keys in separate files and telling Pelican where those files can be found via either the `Origin.S3AccessKeyfile` variable or the `Origin.Exports.S3AccessKeyfile`. See below for examples of S3 origin configurations that use these values, along with an explanation of how to choose which one is right for you. + +### S3 Configuration Examples + +Origins can be configured with multiple exports by using the `Origin.Exports` block of your configuration: + +``` +Origin: + # Things that configure the origin itself + # Tell the origin it will be serving objects from S3 + StorageType: "s3" + S3ServiceUrl: "https://my-s3.com" + S3Region: "us-east-1" + S3UrlStyle: "virtual" + + # The actual namespaces we export. Each export is defined + # via its own export block + Exports: + - S3Bucket: "first-bucket" + FederationPrefix: /first/namespace + Capabilities: ["PublicReads", "Writes", "Listings", "DirectReads"] + - S3Bucket: "second-bucket" + S3AccessKeyfile: "/path/to/second/access.key" + S3SecretKeyfile: "/path/to/second/secret.key" + FederationPrefix: /second/namespace + # Notice we designate "Reads" and not "PublicReads" for this bucket + # because we assume that if the bucket requires credentials to access, + # the origin should, too. + Capabilities: ["Reads", "Writes"] +``` +In this example, the object `foo` from the bucket `first-bucket` would be accessible without any token authorization at the namespace path `/first/namespace/foo`. 
Getting the object `bar` from `second-bucket` would require a valid access token, and would be accessed via `/second/namespace/bar`. In this example, the actual bucket names hosting `foo` and `bar` are hidden from a Pelican user's perspective, because they are accessed through the namespace. If you'd like to make users aware of the underlying bucket name, you can use the bucket name as your `FederationPrefix`. + +Alternatively, if your origin only exports a single bucket, the origin can be configured with top-level config variables (which could also be configured with their equivalent environment variables): + +``` +Origin: + StorageType: "s3" + S3ServiceUrl: "https://my-s3.com" + S3Region: "us-west-2" + S3UrlStyle: "path" + + FederationPrefix: /my/namespace + S3Bucket: "my-bucket" + S3AccessKeyfile: "/path/to/access.key" + S3SecretKeyfile: "/path/to/secret.key" + + # Set up origin capabilities that are also applied to the bucket + EnableWrites: false + EnableReads: true + EnableListings: false + EnableDirectReads: true +``` + +### Exporting An Entire S3 Endpoint + +In some cases, it may be infeasible to set up an origin that exports every bucket you'd like to make accessible via a Pelican federation. For example, [Amazon's Open Data program](https://aws.amazon.com/opendata) hosts many terabytes of public data across thousands of buckets and a handful of regions. Manually enumerating all of these buckets in an origin config would quickly become intractable. Instead, Pelican provides a mechanism that allows you to export all the public buckets from an S3 endpoint. This is accomplished by omitting the bucket field when you set up the export. The following example could be used to set up an origin that exports AWS public data from the `us-east-1` region. 
+ +``` +Origin: + # Things that configure the origin itself + # Tell the origin it will be serving objects from S3 + StorageType: "s3" + S3ServiceUrl: "https://s3.us-east-1.amazonaws.com" + S3Region: "us-east-1" + S3UrlStyle: "virtual" + + # The actual namespaces we export. Each export is defined + # via its own export block + Exports: + - FederationPrefix: /aws-public + Capabilities: ["PublicReads", "Listings", "DirectReads"] +``` + +In this configuration, users who wish to fetch objects from the origin will still need to know the name of the bucket that hosts those objects. For example, the AWS public bucket `noaa-wod-pds` has an object called `MD5SUMS`, and with this configuration the object can be fetched at `/aws-public/noaa-wod-pds/MD5SUMS`. -Pelican by default launches an origin server with a POSIX storage backend, which accesses files through your operating system. Pelican origin also supports S3 as the storage backend. We are working hard to document how to configure and launch a Pelican origin with an S3 storage backend. If you are currently interested in this approach, please contact help@pelicanplatform.org for further instructions. ## Login to Admin Website -The next step is to initialize the website for admin to management the origin. Go to the URL specified in the terminal above. By default, it should point to https://localhost:8444/view/initialization/code/ +After your origin is running, the next step is to initialize its web interface, which can be used by administrators for monitoring and further configuration. To initialize this interface, go to the URL specified in the terminal. By default, it should point to https://localhost:8444/view/initialization/code/ You will be directed to the page to activate the website with a one-time passcode. Copy the passcode from the terminal where you launch Pelican origin and paste to the website to finish activation. 
@@ -122,48 +308,50 @@ You will be directed to the page to activate the website with a one-time passcod In our case, it's `551220` from the example terminal above. -> Note that your one-time passcode will be different from the example. +> **NOTE:** Your one-time passcode will be different from the example. -> Also note that the one-time passcode will be refreshed every minute. Find the latest passcode in the terminal before proceeding. +> **NOTE:** These one-time passcodes will be refreshed every minute. Find the latest passcode in the terminal before proceeding. ### Set up password for the admin -After activating the website, you will be redirected to set up the password for the admin account. Type your password and re-type again to confirm. +After activating the website, you will be redirected to set up the password for the admin account. Type your password, then retype it to confirm. Then store this password in a safe location. -### Visit origin dashboard page +### Visit the Origin's Dashboard Page -Once confirming the new password, you will be redirected to the dashboard page of the origin website. +Once the password is confirmed, you will be redirected to the origin's dashboard page. -Where the graph on the right-side visualizes the file transfer metrics that records the transfer **speed** for both receiving (rx) and transmitting (tx) data. You may change the time range of the graph by changing the **Reporting Period** and **Gragh Settings**. +Here, the graph on the right visualizes object transfer metrics like transfer **speed** for both receiving (rx) and transmitting (tx) data. You may change the time range of the graph by changing the **Reporting Period** and **Graph Settings**. -> Note that the graph can be empty at the server start, as it takes a couple of minutes to collect the first data. Refresh the page after the origin server runs for 5 minutes and you should start to see data points coming. 
+> **NOTE:** This graph may be empty when the origin first starts, as it takes several minutes to collect enough data for the display. Try refreshing the page after the origin has been running for ~5 minutes and you should see data being aggregated. -The **Status** panel shows the health status of the origin by different components. +The **Status** panel shows information about your origin's health. Its components are: -* **Director** status indicates if the origin can advertise itself to the director, so that the direct can redirect file access from the client to the origin server. -* **Federation** status indicates if the origin can fetch metadata from the federation URL endpoint to know where each federation server is located (director and registry) -* **Web UI** status indicates if the admin website is successfully launched. -* **XRootD** status indicates if the underlying file transfer software that Pelican uses is functioning. +* **Director**: This indicates whether the origin can advertise itself to its federation director, which is required for other members in the federation to discover your origin's existence and how to access objects from it. +* **Federation**: This indicates whether the origin can fetch metadata from the federation URL, which the origin uses to discover the federation's central services (Director and Registry). +* **Web UI**: This indicates whether the admin website is successfully configured and running. +* **XRootD**: This indicates whether Pelican's underlying file transfer software is functioning as expected. -The **Data Exports** panel lists the currently exported directory on the host machine and their corresponding namespace prefix for Pelican. +The **Data Exports** panel lists information about the federation prefixes that are currently being exported by the origin. The **Federation Overview** panel lists the links to various federation services (director, registry, etc.). 
Note that the link to the **Discovery** item is the endpoint where the metadata of a federation is located. ### For local deployment -When you hit the URL at https://localhost:8444/view/initialization/code/, You may see a warning that looks like the following (with some differences with respect to the browser): +When you hit the URL at https://localhost:8444/view/initialization/code/, you may see a warning that looks like the following (with some differences depending on the browser you use): -The warning is due to the fact that Pelican servers by default use `https` for network requests, which requires a set of TLS certificates to secure the connection between the server and the browser. If you don't have TLS certifacates configured and turned on `TLSSkipVerify` configuration parameter, then the Pelican origin will generate a set of self-signed TLS certifacates that are not trusted by the browser. However, it's OK to proceed with the warning for local deployment. +The warning is due to the fact that Pelican servers by default use `https` for network requests, which requires a set of TLS certificates to secure the connection between the server and the browser. If you don't have TLS certificates properly configured and you turned on the `TLSSkipVerify` configuration parameter, then the origin will generate a set of self-signed certificates that are not trusted by the browser. + +For local testing, it's OK to proceed past the warning. ## Test Origin Functionality -Once you have your origin set up, follow the steps below to test if your origin can serve a file through a Pelican federation. +Once you have your origin set up, follow the steps below to test if your origin can serve a file through a Pelican federation. It's best to test your origin while it's serving public data, which rules out malformed test tokens as the reason objects can't be pulled through the origin. 1. 
Create a test file under the directory on your host machine that binds to a Pelican namespace. This the `` in `-v :` argument when you run the Pelican origin. Assuming your directory is `/tmp/demo`, run the following command to create a test file named `testfile.txt` under `/tmp/demo` @@ -171,7 +359,7 @@ Once you have your origin set up, follow the steps below to test if your origin echo "This is a test file.\n" > /tmp/demo/testfile.txt ``` -2. In a **seperate terminal**, run the following command to get the data from your origin through the Pelican federation +2. In a **separate terminal**, run the following command to get the data from your origin through the Pelican federation ``` $ cd ~ From 9a79b6531ea3a758a0c6b8e9d52f7e53ab293333 Mon Sep 17 00:00:00 2001 From: Justin Hiemstra Date: Tue, 16 Apr 2024 14:28:00 +0000 Subject: [PATCH 2/4] Update with feedback from review --- docs/pages/serving_an_origin.mdx | 32 ++++++++++++++++---------------- docs/parameters.yaml | 28 ++++++++++++++-------------- 2 files changed, 30 insertions(+), 30 deletions(-) diff --git a/docs/pages/serving_an_origin.mdx b/docs/pages/serving_an_origin.mdx index 964b080d9..c4931dafd 100644 --- a/docs/pages/serving_an_origin.mdx +++ b/docs/pages/serving_an_origin.mdx @@ -23,11 +23,11 @@ For _macOS_ and _Windows_ users who want to serve a Pelican origin, please use [ ### Open Firewall Port for Pelican Origin -At their core, Pelican origins are web servers that listen to two TCP ports for file transfers and Web UI. By default, the browser/API interface for your origin will be at port `8444`, and the port for object transfers will be at `8443`. You may change these port numbers through the [configuration file](./parameters.mdx) with parameters [`Origin.Port`](./parameters.mdx#Origin-Port) and [`Server.WebPort`](./parameters.mdx#Server-WebPort) respectively. +At their core, Pelican origins are web servers that listen to two TCP ports for file transfers and Web UI. 
By default, the Web UI and API interface for your origin will be at port `8444`, and the port for object transfers will be at `8443`. You may change these port numbers through the [configuration file](./parameters.mdx) with parameters [`Server.WebPort`](./parameters.mdx#Server-WebPort) and [`Origin.Port`](./parameters.mdx#Origin-Port) respectively. In order for Pelican origins to work properly, these ports need to be accessible by the federation, which in most cases means they need to be open to the internet. If your server host has a firewall policy in place, please open these two ports for both incoming the outgoing TCP requests. -> **NOTE:** If it is not possible for you to expose any ports through the firewall (eg you're on a local network or behind a NAT), Pelican has a special feature called a _Connection Broker_ that allows you to serve origins without publicly-accessible ports and TLS credentials. However, this is an experimental feature and requires the Pelican federation you are joining to be compatible. If you are interested in learning more about the Connection Broker, please contact help@pelicanplatform.org for further instructions. +> **NOTE:** If it is not possible for you to expose any ports through the firewall (e.g. you're on a local network or behind a NAT), Pelican has a special feature called a _Connection Broker_ that allows you to serve origins without publicly-accessible ports and TLS credentials. However, this is an experimental feature and requires the Pelican federation you are joining to be compatible. If you are interested in learning more about the Connection Broker, please contact help@pelicanplatform.org for further instructions. ### Prepare TLS Credentials @@ -61,7 +61,7 @@ Since your TLS certificate is associated with your domain name, you will need to When you've completed the aforementioned steps, you're ready to start configuring the origin that will add your data to a federation. 
Serving an origin is the process of taking some underlying storage repository and making its data accessible via a namespace prefix in your federation. For example, you might make files in the directory `/my/directory` available at the federation path `/my/namespace` so that anyone with access to the federation can get objects from the directory -By default, Pelican origins serve files from a POSIX backend, the filesystem used by Linux computers. However, Pelican aims to support a variety of backends and we currently also support serving objects from S3. Configuration for S3 is mostly similar to configuration for POSIX filesystems, but with a few important differences. For information about S3 backends, jump to our [s3 documentation](#launch-the-origin-with-an-s3-storage-backend) +By default, Pelican origins serve files from a POSIX backend, the filesystem used by Linux computers. However, Pelican aims to support a variety of backends and we currently also support serving objects from S3. Configuration for S3 is mostly similar to configuration for POSIX filesystems, but with a few important differences. For information about S3 backends, jump to the [S3 documentation](#launch-the-origin-with-an-s3-storage-backend) below. > If you are running Pelican docker image to serve an origin, please refer to the [docker image documentation](./install/docker.mdx#run-pelican-origin-server). @@ -73,9 +73,9 @@ Federations are identified by a URL, which is used to host information your orig To point your origin at a specific federation, you can either pass the `-f ` flag if running from the command line, or configure `Federation.DiscoveryUrl: ` in your config yaml. -### Serve the Origin +### Starting the Origin -Origins can be configured via the command line, via a config file named `pelican.yaml`, via environment variables, or through a combinations of the three. 
While simple origins can be run entirely from command line arguments, more complex origins will require configuration your your `pelican.yaml`. +Origins can be configured via the command line, a config file named `pelican.yaml`, environment variables, or a combination of the three. While simple origins can be run entirely from command line arguments, more complex origins will require configuration in your `pelican.yaml`. To start a simple pelican origin from the command line that serves POSIX data, run: @@ -89,11 +89,11 @@ Where: * `` is the absolute path to the directory containing files you want to export as Pelican objects in your federation * `` is the federation prefix at which files in `/path/to/data` will be accessed from in the federation. Note that federation prefixes follow POSIX path conventions, and they must begin with `/` to denote an absolute path. -> **NOTE:** By default, origins require authorization tokens for object access. There's currently no way to serve a public origin using only the command line, but you can find more information about various access controls by looking at [origin capabilities](#origin-and-namespace-capabilities) below. +> **NOTE:** By default, origins require authorization tokens for object access. Pelican currently does not support serving a public origin using only the command line, but various access controls can be configured through your configuration file. For more information, see [origin capabilities](#origin-and-namespace-capabilities) below. -To run the same origin using a `pelican.yaml` configuration file, save your configuration to `/etc/pelican/pelican.yaml` if you're running Pelican as root, and at `~/.config/pelican/pelican.yaml` if you're running as a non-root user. 
The command line origin from above could be configured accordingly: +To run the same origin using a `pelican.yaml` configuration file, save your configuration to `/etc/pelican/pelican.yaml` if you're running Pelican as root, or at `~/.config/pelican/pelican.yaml` if you're running as a non-root user. The command line origin from above could be configured accordingly: -``` +```yaml filename="pelican.yaml" # Tell Pelican which federation you're joining Federation: DiscoveryUrl: @@ -118,11 +118,11 @@ pelican origin serve Pelican will read the config file and apply it to your origin. -Finally, origins can be configured to a limited extent with environment variables. In Pelican's environment variable model, configuration options are taken from `pelican.yaml`, flattened, and prepended with either `PELICAN_` or `OSDF_`, depending on the name of the binary you're using (ie whether you run `osdf serve` or `pelican serve` commands). +Finally, origins can be configured to a limited extent with environment variables. In Pelican's environment variable model, configuration options are taken from `pelican.yaml`, flattened, and prepended with either `PELICAN_` or `OSDF_`, depending on the name of the binary you're using (i.e. whether you run `osdf serve` or `pelican serve` commands). For example, you might configure the origin's storage type by setting the environment variable `PELICAN_ORIGIN_STORAGETYPE=posix`. -> Environment variable configuration does not support the same complex structures that can be built with yaml configuration, such as lists. +> Environment variable configuration does not support complex structures that can be built with yaml configuration, such as `object`-type parameters. 
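As a rough illustration of the naming rule described above, the following shell sketch flattens a dotted parameter name into its environment variable form. The `to_env_name` helper is hypothetical and not part of Pelican. It assumes that "flattened" means replacing `.` with `_` and upper-casing the result, which matches the `PELICAN_ORIGIN_STORAGETYPE` example but is not confirmed here for every parameter.

```shell
# Hypothetical helper (not part of Pelican): derive the environment variable
# name for a dotted configuration parameter.
# Assumption: "flattened" means dots become underscores and letters are upper-cased.
to_env_name() {
  prefix="$1"  # "PELICAN" or "OSDF", depending on the binary name
  param="$2"   # dotted parameter name, e.g. Origin.StorageType
  printf '%s_%s\n' "$prefix" "$(printf '%s' "$param" | tr '.' '_' | tr '[:lower:]' '[:upper:]')"
}

# Print the variable name for the storage type parameter
to_env_name PELICAN Origin.StorageType
```

A name derived this way would then be set in the environment of the `pelican origin serve` process, for instance as `PELICAN_ORIGIN_STORAGETYPE=posix`.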
The first time the origin is started, you will see something that looks like the following: @@ -133,7 +133,7 @@ Pelican admin interface is not initialized To initialize, login at https://localhost:8444/view/initialization/code/ with the following code: 551220 ``` -See the [browser configuration](#login-to-admin-website) documentation section for more information about initializing your origin's web interface. +See the [admin website configuration](#login-to-admin-website) documentation section for more information about initializing your origin's admin website. ### Origin and Namespace Capabilities @@ -142,12 +142,12 @@ Capabilities are the configuration options you can assign to origins and namespa - "Reads": When included, objects from the namespace can be read with a valid authorization token. - "PublicReads": When set, objects from the namespace become public and require no authorization to read. - "Writes": When included, objects can be written back to the storage backend by Pelican. Write operations _always_ require a valid authorization token. -- "DirectReads": When included, a namespace indicates that data being pulled from it should only be pulled through a cache and not directly by clients. This may be useful in cases where the origin isn't very performant or has to pay egress costs when data moves through it. Note that this is respected by federation central services, but may not be respected by all clients. +- "DirectReads": When included, a namespace indicates that it is willing to serve clients directly and does not require data to be pulled through a cache. Disabling this feature may be useful in cases where the origin isn't very performant or has to pay egress costs when data moves through it. Note that this is respected by federation central services, but may not be respected by all clients. - "Listings": When included, the namespace indicates it will allow object discovery. 
Be careful when setting this for authorized namespaces, as this will allow anyone to discover the names of objects exported by this namespace. > **NOTE:** Most origins should have either "Reads" or "PublicReads" enabled. If neither is set, the origin won't export any data. -There is an important distinction between _origin_ capabilities and _namespace_ capabilities. While it's sometimes easy to treat origins and namespaces as the same thing, Pelican must distinguish between them because two separate origins may export portions of the same namespace, and a single origin may export two disparate prefixes. The only exception to this rule is when a single origin serves a single namespace. +There is an important distinction between _origin_ capabilities and _namespace_ capabilities. While it's sometimes easy to treat origins and namespaces as the same thing, Pelican must distinguish between them because two separate origins may export portions of the same namespace, and a single origin may export two disparate prefixes. The only exception to this rule is when a single origin serves a single namespace, or the origin exports multiple prefixes that should all have the same capabilities. To configure _origin_ capabilities, you can set top-level options for the origin: @@ -163,7 +163,7 @@ If no `Origin.Exports` block is provided to Pelican, these values will also be a ### Multi-Export Origins -The previous examples have shown how one might export a single namespace, but Pelican origins can export multiple paths from the same storage backend under different namespaces. For example, you have have two posix directories called `/my/data/public` and `/my/data/private`. If you want to make your public data available under the namespace `/my/prefix/public` and your private data available under `/my/prefix/private`, you'll need to configure a multi-export origin, which is accomplished through the origin's `Exports` block. 
Below is an example of what that looks like, along with how you could configure access control for the two namespaces: +The previous examples have shown how one might export a single namespace, but Pelican origins can export multiple paths from the same storage backend under different namespaces. For example, assume you have two POSIX directories called `/my/data/public` and `/my/data/private`. If you want to make your public data available under the namespace `/my/prefix/public` and your private data available under `/my/prefix/private`, you'll need to configure a multi-export origin, which is accomplished through the origin's `Exports` block. Below is an example of what that looks like, along with how you could configure access control for the two namespaces: ``` Federation: @@ -215,7 +215,7 @@ Serving S3 origins with Pelican is similar to serving POSIX origins, but with se - The URL you use to access objects from S3, also known as the _S3 Service URL_ - The _region_ that your S3 instance is hosted out of (almost always `us-east-1` unless you're actually using S3 from Amazon) -- The name of the bucket your objects are stored in +- The name of the _bucket_ your objects are stored in - The type of bucket hosting used at the S3 service URL, which can be either _path_ or _virtual_. This determines whether objects are normally accessed like `https:////` (path-style hosting) or `https://./` (virtual-style hosting), but does not change the way you access objects through Pelican. In many cases, it's safe to assume _path_-style hosting, and this is set to Pelican's default. For more information about different hosting styles in S3, see the [AWS documentation](https://docs.aws.amazon.com/AmazonS3/latest/userguide/VirtualHosting.html). > **NOTE:** Pelican has a special mode where no bucket information is provided that allows you to export objects from all public buckets at a given service URL. 
This is covered in further detail [later in this section](#exporting-an-entire-s3-endpoint). @@ -300,7 +300,7 @@ In this configuration, users who wish to fetch objects from the origin will stil ## Login to Admin Website -After your origin is running, the next step is to initialize its web interface, which can be used by administrators for monitoring and further configuration. To initialize this interface, go to the URL specified in the terminal. By default, it should point to https://localhost:8444/view/initialization/code/ +After your origin is running, the next step is to initialize its admin website, which can be used by administrators for monitoring and further configuration. To initialize this interface, go to the URL specified in the terminal. By default, it should point to https://localhost:8444/view/initialization/code/ You will be directed to the page to activate the website with a one-time passcode. Copy the passcode from the terminal where you launch Pelican origin and paste to the website to finish activation. diff --git a/docs/parameters.yaml b/docs/parameters.yaml index 4fa3d41d6..8514f30c5 100644 --- a/docs/parameters.yaml +++ b/docs/parameters.yaml @@ -484,8 +484,8 @@ name: Origin.FederationPrefix description: |+ The namespace prefix of the origin's contents within the federation. - NOTE: This config option is incompatible with multiple exports defined via `Origin.Exports` and requires that the origin exports - only a single path. + NOTE: This config option is incompatible with multiple exports defined via `Origin.Exports` and is ignored when the origin + exports multiple prefixes. type: string default: none components: ["origin"] @@ -496,8 +496,8 @@ description: |+ of "posix", this constitutes the path on disk exported by the origin for the federation. If the origin has a StorageType of "s3", this value is not currently used. 
- NOTE: This config option is incompatible with multiple exports defined via `Origin.Exports` and requires that the origin exports - only a single path. + NOTE: This config option is incompatible with multiple exports defined via `Origin.Exports` and is ignored when the origin + exports multiple prefixes. type: string default: none components: ["origin"] @@ -516,8 +516,8 @@ description: |+ A boolean indicating whether the origin permits reads without valid authorization. When false, reads from the origin will require a properly-scoped authorization token signed by the origin's issuer. - NOTE: This config option is incompatible with multiple exports defined via `Origin.Exports` and requires that the origin exports - only a single path. + NOTE: This config option is meant to configure an _origin's_ capabilities, but can be used to configure a namespace when the origin + exports only a single prefix or when every exported namespace should inherit the same configuration. type: bool default: false components: ["origin"] @@ -526,8 +526,8 @@ name: Origin.EnableReads description: |+ A boolean indicating whether the origin permits any reads. When false, the origin may still allow writes. - NOTE: This config option is incompatible with multiple exports defined via `Origin.Exports` and requires that the origin exports - only a single path. + NOTE: This config option is meant to configure an _origin's_ capabilities, but can be used to configure a namespace when the origin + exports only a single prefix or when every exported namespace should inherit the same configuration. type: bool default: true components: ["origin"] @@ -536,8 +536,8 @@ name: Origin.EnableWrites description: |+ A boolean indicating whether the origin permits writes. All writes require authorization. - NOTE: This config option is incompatible with multiple exports defined via `Origin.Exports` and requires that the origin exports - only a single path.type: bool. 
+ NOTE: This config option is meant to configure an _origin's_ capabilities, but can be used to configure a namespace when the origin + exports only a single prefix or when every exported namespace should inherit the same configuration. type: bool default: true components: ["origin"] @@ -546,8 +546,8 @@ name: Origin.EnableListings description: |+ A boolean indicating whether the origin permits object listings. When true, clients can list the contents of the origin. - NOTE: This config option is incompatible with multiple exports defined via `Origin.Exports` and requires that the origin exports - only a single path. + NOTE: This config option is meant to configure an _origin's_ capabilities, but can be used to configure a namespace when the origin + exports only a single prefix or when every exported namespace should inherit the same configuration. type: bool default: true components: ["origin"] @@ -557,8 +557,8 @@ description: |+ A boolean indicating whether the origin permits direct reads. When true, the origin indicates that it is willing to interact directly with clients. When false, the origin is indicating it is only willing to interact with clients via a cache service. - NOTE: This config option is incompatible with multiple exports defined via `Origin.Exports` and requires that the origin exports - only a single path. + NOTE: This config option is meant to configure an _origin's_ capabilities, but can be used to configure a namespace when the origin + exports only a single prefix or when every exported namespace should inherit the same configuration. 
type: bool default: true components: ["origin"] From a1136fa7f4d9a2c32510069447a78f0c3fde267b Mon Sep 17 00:00:00 2001 From: Justin Hiemstra Date: Tue, 16 Apr 2024 14:55:31 +0000 Subject: [PATCH 3/4] Stick multi-export/S3 docs in dropdown accordion --- docs/pages/serving_an_origin.mdx | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/docs/pages/serving_an_origin.mdx b/docs/pages/serving_an_origin.mdx index c4931dafd..86c3d534a 100644 --- a/docs/pages/serving_an_origin.mdx +++ b/docs/pages/serving_an_origin.mdx @@ -162,6 +162,7 @@ If no `Origin.Exports` block is provided to Pelican, these values will also be a > **NOTE:** Pelican tries to resolve differences between origin and namespace configurations by respecting the more restrictive of the two. If you serve an origin that enables public reads, but the underlying prefix it exports disables all reads, you won't be able to read from that namespace. ### Multi-Export Origins +
Click to see more... The previous examples have shown how one might export a single namespace, but Pelican origins can export multiple paths from the same storage backend under different namespaces. For example, assume you have have two posix directories called `/my/data/public` and `/my/data/private`. If you want to make your public data available under the namespace `/my/prefix/public` and your private data available under `/my/prefix/private`, you'll need to configure a multi-export origin, which is accomplished through the origin's `Exports` block. Below is an example of what that looks like, along with how you could configure access control for the two namespaces: @@ -185,6 +186,7 @@ Origin: ``` > **NOTE:** While multiple namespaces can be exported by the same origin, they must all have the same underlying storage type. That is, if the origin serves files from POSIX, it must only serve files from POSIX and not S3. +
### Additional Command Line Arguments for Origins @@ -202,6 +204,7 @@ This section documents additional arguments you can pass via the command line wh For more information about available yaml configuration options, refer to the [Parameters page](./parameters.mdx). ## Launch the Origin With an S3 Storage Backend +
Click to see more... ### What is S3? @@ -296,7 +299,7 @@ Origin: ``` In this configuration, users who wish to fetch objects from the origin will still need to know the name of the bucket that hosts those objects. For example, the AWS public bucket `noaa-wod-pds` has an object called `MD5SUMS`, and with this configuration the object can be fetched at `/aws-public/noaa-wod-pds/MD5SUMS`. - +
## Login to Admin Website From 0f779dc69531cab1f61096cd279404aa20826c46 Mon Sep 17 00:00:00 2001 From: Justin Hiemstra Date: Tue, 16 Apr 2024 15:13:56 +0000 Subject: [PATCH 4/4] Revise some language after another origin docs readthrough --- docs/pages/serving_an_origin.mdx | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/docs/pages/serving_an_origin.mdx b/docs/pages/serving_an_origin.mdx index 86c3d534a..9f5fe79e1 100644 --- a/docs/pages/serving_an_origin.mdx +++ b/docs/pages/serving_an_origin.mdx @@ -4,7 +4,7 @@ import ExportedImage from "next-image-export-optimizer"; Pelican users who want to share data within a Pelican federation do so via an *Origin*. Origins are a crucial component of Pelican's architecture for two reasons: they act as an adapter between various storage backends and Pelican federations, and they provide fine-grained access controls for that data. That is, they figure out how to take data from wherever it lives (such as a POSIX filesystem, S3 buckets, HTTPS servers, etc.) and transform it into a format that the federation can utilize while respecting your data access requirements. -> An important distinction between origins and data backends is that, generally speaking, origins do **NOT** store any data themselves; their primary function is to facilitate data accessibility. +> **NOTE:** An important distinction between origins and data backends is that, generally speaking, origins do **NOT** store any data themselves; their primary function is to facilitate data accessibility. This document contains instructions on how to serve a Pelican origin on top of a variety of storage backend types. @@ -19,11 +19,11 @@ If you prefer to install Pelican as a standalone binary, you need to follow [add > **NOTE:** Serving origins with a standalone Pelican binary is possible, but not recommended. -For _macOS_ and _Windows_ users who want to serve a Pelican origin, please use [Pelican docker image](./install/docker.mdx). 
+_MacOS_ and _Windows_ users who want to serve a Pelican origin should use the [Pelican docker image](./install/docker.mdx). ### Open Firewall Port for Pelican Origin -At their core, Pelican origins are web servers that listen to two TCP ports for file transfers and Web UI. By default, the Web UI and API interface for your origin will be at port `8444`, and the port for object transfers will be at `8443`. You may change these port numbers through the [configuration file](./parameters.mdx) with parameters [`Server.WebPort`](./parameters.mdx#Server-WebPort) and [`Origin.Port`](./parameters.mdx#Origin-Port) respectively. +At their core, Pelican origins are web servers that listen to two TCP ports for file transfers and Web UI. By default, the Web UI and API interface for your origin will be at port `8444`, and the port for object transfers will be at `8443`. You may change these port numbers through the [configuration file](./parameters.mdx) with parameters [`Server.WebPort`](./parameters.mdx#Server-WebPort) and [`Origin.Port`](./parameters.mdx#Origin-Port), respectively. In order for Pelican origins to work properly, these ports need to be accessible by the federation, which in most cases means they need to be open to the internet. If your server host has a firewall policy in place, please open these two ports for both incoming the outgoing TCP requests. @@ -69,7 +69,7 @@ By default, Pelican origins serve files from a POSIX backend, the filesystem use Before serving an origin, you need to decide which [**federation**](./client-usage.mdx#useful-terminology) your data will be accessed through. For example, the Open Science Data Federation (OSDF) is Pelican's flagship federation, and if you are interested in serving an OSDF origin, you can refer to the [OSDF website](https://osg-htc.org/services/osdf.html) for details about how to join. 
-Federations are identified by a URL, which is used to host information your origin will need in order to discover other federation services. For example, the OSDF's federation URL is `https://osg-htc.org`, and an origin that joins the OSDF will visit `https://osg-htc.org/.well-known/pelican-configuration` to get important metadata about the federation's central services (the Director and Registry). +Federations are identified by their URL, which is used to host information that origins need for discovering other federation services. For example, the OSDF's federation URL is `https://osg-htc.org`, and an origin that joins the OSDF will visit `https://osg-htc.org/.well-known/pelican-configuration` to get important metadata about the federation's central services (the Director and Registry). To point your origin at a specific federation, you can either pass the `-f ` flag if running from the command line, or configure `Federation.DiscoveryUrl: ` in your config yaml. @@ -122,7 +122,7 @@ Finally, origins can be configured to a limited extent with environment variable For example, you might configure the origin's storage type by setting the environment variable `PELICAN_ORIGIN_STORAGETYPE=posix`. -> Environment variable configuration does not support complex structures that can be built with yaml configuration, such as `object`-type parameters. +> **NOTE:** Environment variable configuration does not support complex structures that can be built with yaml configuration, such as `object`-type parameters. The first time the origin is started, you will see something that looks like the following: @@ -137,7 +137,7 @@ See the [admin website configuration](#login-to-admin-website) documentation sec
In the previous yaml configuration, we configured the origin capabilities using the `Capabilities` list of the `Origin.Exports` block by specifying that the federation prefix `/your/federation/prefix` supports "Reads" and "Writes". This list of capabilities can be used for further control of what types of operations the namespace is willing to support. Available capabilities include: +Origins and namespaces can be configured with a set of _capabilities_, which are the configuration options used to define data access controls. In the previous yaml configuration, we configured the origin capabilities using the `Capabilities` list of the `Origin.Exports` block by specifying that the federation prefix `/your/federation/prefix` supports "Reads" and "Writes". This list of capabilities can be used for further control of what types of operations the namespace is willing to support. Available capabilities include: - "Reads": When included, objects from the namespace can be read with a valid authorization token. - "PublicReads": When set, objects from the namespace become public and require no authorization to read.