This example Terraform configuration demonstrates how to use the DAOS Terraform Modules to deploy a DAOS cluster consisting of servers and clients.
If you have not completed the pre-deployment steps, please complete them before running this Terraform example.
Click the button below to run this example in a Cloudshell tutorial. The tutorial will walk through each of the steps described in this README.md file.
List of Terraform files in this example
Filename | Description |
---|---|
main.tf | Main Terraform configuration file containing resource definitions |
variables.tf | Variable definitions for variables used in main.tf |
versions.tf | Provider definitions |
terraform.tfvars.perf.example | Pre-configured set of variables focused on performance |
terraform.tfvars.tco.example | Pre-configured set of variables focused on lower total cost of ownership |
The following sections describe how to deploy a DAOS cluster with this example Terraform configuration.
Before you run any terraform commands you need to create a terraform.tfvars file in the terraform/examples/daos_cluster directory.
The terraform.tfvars file will contain the variable values for the configuration.
To ensure a successful deployment of a DAOS cluster there are two terraform.tfvars.*.example files that you can choose from.
You will need to decide which of these files to copy to terraform.tfvars.
The terraform.tfvars.tco.example file contains variables for a DAOS cluster deployment with
- 16 DAOS Client instances
- 4 DAOS Server instances, each with sixteen 375GB NVMe SSDs
To use the terraform.tfvars.tco.example file, run
cp terraform.tfvars.tco.example terraform.tfvars
The terraform.tfvars.perf.example file contains variables for a DAOS cluster deployment with
- 16 DAOS Client instances
- 4 DAOS Server instances, each with four 375GB NVMe SSDs
To use the terraform.tfvars.perf.example file, run
cp terraform.tfvars.perf.example terraform.tfvars
Now that you have a terraform.tfvars file you need to replace the <project_id> placeholder in the file with your GCP project id.
To update the project id in terraform.tfvars run
PROJECT_ID=$(gcloud config list --format 'value(core.project)')
sed -i "s/<project_id>/${PROJECT_ID}/g" terraform.tfvars
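The replacement above can be checked before running Terraform. The following is a minimal sketch that works on a scratch copy rather than the real file. The project id my-example-project and the exact form of the project_id line are assumptions for illustration; only the <project_id> placeholder itself comes from this README.

```shell
# Work on a scratch copy so the sketch is safe to run anywhere.
workdir="$(mktemp -d)"
tfvars="${workdir}/terraform.tfvars"

# Stand-in for the copied example file (assumed line format);
# only the <project_id> placeholder matters here.
printf 'project_id = "<project_id>"\n' > "${tfvars}"

# Hypothetical project id; in a real run this comes from gcloud as shown above.
PROJECT_ID="my-example-project"

if [ -z "${PROJECT_ID}" ]; then
  # An empty value would silently produce project_id = "".
  echo "PROJECT_ID is empty; set a default project with gcloud first" >&2
else
  # Write to a new file and move it into place instead of using sed -i,
  # whose syntax differs between GNU sed and BSD sed.
  sed "s/<project_id>/${PROJECT_ID}/g" "${tfvars}" > "${tfvars}.new" &&
    mv "${tfvars}.new" "${tfvars}"
fi

# Confirm that no placeholder is left behind.
if grep -q '<project_id>' "${tfvars}"; then
  echo "placeholder still present in ${tfvars}" >&2
else
  echo "project id set:"
  cat "${tfvars}"
fi
```

The empty-value check is worth keeping in a real script, since gcloud prints nothing when no default project is configured.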
Billing Notification!
Running this example will incur charges in your project.
To avoid surprises, be sure to monitor your costs associated with running this example.
Don't forget to shut down the DAOS cluster with terraform destroy when you are finished.
To deploy the DAOS cluster
terraform init
terraform plan -out=tfplan
terraform apply tfplan
After your DAOS cluster has been deployed you can log into the first DAOS client instance to perform administrative tasks.
Verify that the daos-client and daos-server instances are running
gcloud compute instances list \
--filter="name ~ daos" \
--format="value(name,INTERNAL_IP)"
Log into the first client instance
gcloud compute ssh daos-client-0001
sudo dmg system query -v
The State column should display "Joined" for all servers.
Rank UUID Control Address Fault Domain State Reason
---- ---- --------------- ------------ ----- ------
0 0796c576-5651-4e37-aa15-09f333d2d2b8 10.128.0.35:10001 /daos-server-0001 Joined
1 f29f7058-8abb-429f-9fd3-8b13272d7de0 10.128.0.77:10001 /daos-server-0003 Joined
2 09fc0dab-c238-4090-b3f8-da2bd4dce108 10.128.0.81:10001 /daos-server-0002 Joined
3 2cc9140b-fb12-4777-892e-7d190f6dfb0f 10.128.0.30:10001 /daos-server-0004 Joined
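The check on the State column can also be scripted. The sketch below parses the sample output shown above with awk and counts the ranks that report Joined. It assumes the column layout stays as shown, so that the State value is the fifth whitespace-separated field. On a client instance you would capture the real output of sudo dmg system query -v instead of using the embedded sample.

```shell
# Sample output copied from this README; replace with the captured output of
# "sudo dmg system query -v" when running on a client instance.
query_output='Rank UUID                                 Control Address   Fault Domain      State  Reason
---- ----                                 ---------------   ------------      -----  ------
0    0796c576-5651-4e37-aa15-09f333d2d2b8 10.128.0.35:10001 /daos-server-0001 Joined
1    f29f7058-8abb-429f-9fd3-8b13272d7de0 10.128.0.77:10001 /daos-server-0003 Joined
2    09fc0dab-c238-4090-b3f8-da2bd4dce108 10.128.0.81:10001 /daos-server-0002 Joined
3    2cc9140b-fb12-4777-892e-7d190f6dfb0f 10.128.0.30:10001 /daos-server-0004 Joined'

# Data rows start with a numeric rank; the header and separator rows do not.
total=$(printf '%s\n' "${query_output}" |
  awk '$1 ~ /^[0-9]+$/ { n++ } END { print n + 0 }')

# Assumption: State is the fifth whitespace-separated field of a data row.
joined=$(printf '%s\n' "${query_output}" |
  awk '$1 ~ /^[0-9]+$/ && $5 == "Joined" { n++ } END { print n + 0 }')

echo "${joined} of ${total} ranks joined"
if [ "${total}" -gt 0 ] && [ "${joined}" -eq "${total}" ]; then
  echo "all servers joined"
else
  echo "some servers have not joined yet" >&2
fi
```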
Check free NVMe storage.
sudo dmg storage query usage
From the output you can see there are 4 servers each with 1.6TB free. That means there is a total of 6.4TB free.
Hosts SCM-Total SCM-Free SCM-Used NVMe-Total NVMe-Free NVMe-Used
----- --------- -------- -------- ---------- --------- ---------
daos-server-0001 48 GB 48 GB 0 % 1.6 TB 1.6 TB 0 %
daos-server-0002 48 GB 48 GB 0 % 1.6 TB 1.6 TB 0 %
daos-server-0003 48 GB 48 GB 0 % 1.6 TB 1.6 TB 0 %
daos-server-0004 48 GB 48 GB 0 % 1.6 TB 1.6 TB 0 %
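The 6.4TB total can be derived from the output rather than added up by hand. This is a rough sketch that sums the NVMe-Free column of the sample output above. It assumes every server reports NVMe-Free in TB and that the value and its unit are the tenth and eleventh whitespace-separated fields. Output that mixes units would need a conversion step that is not shown here.

```shell
# Sample output copied from this README; replace with the captured output of
# "sudo dmg storage query usage" when running on a client instance.
usage_output='Hosts            SCM-Total SCM-Free SCM-Used NVMe-Total NVMe-Free NVMe-Used
-----            --------- -------- -------- ---------- --------- ---------
daos-server-0001 48 GB     48 GB    0 %      1.6 TB     1.6 TB    0 %
daos-server-0002 48 GB     48 GB    0 %      1.6 TB     1.6 TB    0 %
daos-server-0003 48 GB     48 GB    0 %      1.6 TB     1.6 TB    0 %
daos-server-0004 48 GB     48 GB    0 %      1.6 TB     1.6 TB    0 %'

# Field layout of a data row:
#   1 host, 2-3 SCM total, 4-5 SCM free, 6-7 SCM used,
#   8-9 NVMe total, 10-11 NVMe free, 12-13 NVMe used.
# Header and separator rows have fewer fields, so the unit test skips them.
nvme_free_tb=$(printf '%s\n' "${usage_output}" |
  awk '$11 == "TB" { sum += $10 } END { printf "%.1f", sum }')

echo "total NVMe free: ${nvme_free_tb} TB"
```

The resulting value is the size that the pool creation step passes with the -z option.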
Create one pool that uses the entire 6.4TB.
sudo dmg pool create -z 6.4TB -t 3 --label=pool1
For more information about pools see
- https://docs.daos.io/latest/overview/storage/#daos-pool
- https://docs.daos.io/latest/admin/pool_operations/
Create a container in the pool
daos container create --type=POSIX --properties=rf:0 --label=cont1 pool1
For more information about containers see https://docs.daos.io/latest/overview/storage/#daos-container
Mount the container with dfuse
MOUNT_DIR="${HOME}/daos/cont1"
mkdir -p "${MOUNT_DIR}"
dfuse --singlethread --pool=pool1 --container=cont1 --mountpoint="${MOUNT_DIR}"
df -h -t fuse.daos
You can now store files in the DAOS container mounted on ${HOME}/daos/cont1.
For more information about DFuse see the DAOS FUSE section of the User Guide.
The cont1 container is now mounted on ${HOME}/daos/cont1.
Create a 20GiB file which will be stored in the DAOS filesystem.
cd ${HOME}/daos/cont1
time LD_PRELOAD=/usr/lib64/libioil.so \
dd if=/dev/zero of=./test21G.img bs=1G count=20
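The size of the file written by dd follows from the block size and the count, and it can be checked with wc -c while the container is still mounted. The sketch below works out the expected size of the 20GiB file and then demonstrates the same check on a scaled-down scratch file, so that it can run without a DAOS mount. The scratch path and the scaled-down sizes are assumptions for illustration.

```shell
# bs=1G means 1 GiB (1024^3 bytes) per block; count=20 writes 20 blocks.
block_size=$((1024 * 1024 * 1024))
count=20
expected_bytes=$((block_size * count))
echo "expected size: ${expected_bytes} bytes"   # 21474836480

# Scaled-down demonstration: 20 blocks of 1 MiB in a scratch directory.
scratch="$(mktemp -d)/test.img"
small_block=$((1024 * 1024))
dd if=/dev/zero of="${scratch}" bs="${small_block}" count="${count}" 2>/dev/null

# The arithmetic expansion strips any padding that wc adds on some platforms.
actual_small=$(( $(wc -c < "${scratch}") ))
expected_small=$((small_block * count))

if [ "${actual_small}" -eq "${expected_small}" ]; then
  echo "size check passed: ${actual_small} bytes"
else
  echo "size mismatch: got ${actual_small}, expected ${expected_small}" >&2
fi
rm -f "${scratch}"
```

20 GiB is roughly 21.5 GB in decimal units, which may be why the file is named test21G.img.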
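Unmount the container when you are finished. Leave the mount directory first, because the unmount fails while the shell's working directory is inside the mount.
cd "${HOME}"
fusermount -u "${HOME}/daos/cont1"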
To destroy the DAOS cluster run
terraform destroy
This will shut down all DAOS server and client instances.
Documentation for the terraform/examples/daos_cluster Terraform configuration.
Copyright 2022 Intel Corporation
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
Name | Version |
---|---|
terraform | >= 0.14.5 |
>= 3.54.0 |
No providers.
Name | Source | Version |
---|---|---|
daos_client | ../../modules/daos_client | n/a |
daos_server | ../../modules/daos_server | n/a |
No resources.
Name | Description | Type | Default | Required |
---|---|---|---|---|
allow_insecure | Sets the allow_insecure setting in the transport_config section of the daos_*.yml files | bool | false | no |
client_gvnic | Use Google Virtual NIC (gVNIC) network interface on DAOS clients | bool | false | no |
client_instance_base_name | MIG instance base names to use | string | "daos-client" | no |
client_labels | Set of key/value label pairs to assign to daos-client instances | any | {} | no |
client_machine_type | GCP machine type, e.g. c2-standard-16 | string | "c2-standard-16" | no |
client_mig_name | MIG name | string | "daos-client" | no |
client_number_of_instances | Number of DAOS clients to bring up | number | 4 | no |
client_os_disk_size_gb | OS disk size in GB | number | 20 | no |
client_os_disk_type | OS disk type, e.g. pd-ssd, pd-standard | string | "pd-ssd" | no |
client_os_family | OS GCP image family | string | "daos-client-hpc-centos-7" | no |
client_os_project | OS GCP image project name. Defaults to project_id if null. | string | null | no |
client_preemptible | If preemptible instances | string | false | no |
client_service_account | Service account to attach to the instance. See https://www.terraform.io/docs/providers/google/r/compute_instance_template.html#service_account. | object({ | { | no |
client_template_name | MIG template name | string | "daos-client" | no |
network_name | Name of the GCP network to use | string | "default" | no |
project_id | The GCP project to use | string | n/a | yes |
region | The GCP region to create and test resources in | string | n/a | yes |
server_daos_crt_timeout | crt_timeout | number | 300 | no |
server_daos_disk_count | Number of local SSDs to use | number | 16 | no |
server_daos_disk_type | DAOS disk type to use. For now the only supported one is local-ssd | string | "local-ssd" | no |
server_daos_scm_size | scm_size | number | 200 | no |
server_gvnic | Use Google Virtual NIC (gVNIC) network interface | bool | false | no |
server_instance_base_name | MIG instance base names to use | string | "daos-server" | no |
server_labels | Set of key/value label pairs to assign to daos-server instances | any | {} | no |
server_machine_type | GCP machine type, e.g. e2-medium | string | "n2-custom-36-215040" | no |
server_mig_name | MIG name | string | "daos-server" | no |
server_number_of_instances | Number of DAOS servers to bring up | number | 4 | no |
server_os_disk_size_gb | OS disk size in GB | number | 20 | no |
server_os_disk_type | OS disk type, e.g. pd-ssd, pd-standard | string | "pd-ssd" | no |
server_os_family | OS GCP image family | string | "daos-server-centos-7" | no |
server_os_project | OS GCP image project name. Defaults to project_id if null. | string | null | no |
server_pools | List of pools and containers to be created | list(object({ | [] | no |
server_preemptible | If preemptible instances | string | false | no |
server_service_account | Service account to attach to the instance. See https://www.terraform.io/docs/providers/google/r/compute_instance_template.html#service_account. | object({ | { | no |
server_template_name | MIG template name | string | "daos-server" | no |
subnetwork_name | Name of the GCP sub-network to use | string | "default" | no |
subnetwork_project | The GCP project where the subnetwork is defined | string | null | no |
zone | The GCP zone to create and test resources in | string | n/a | yes |
No outputs.