Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add multi-mount parallelstore support #3256

Open
wants to merge 1 commit into
base: develop
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
41 changes: 23 additions & 18 deletions modules/file-system/parallelstore/scripts/mount-daos.sh
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,17 @@ sed -i "s/#.*transport_config/transport_config/g" $daos_config
sed -i "s/#.*allow_insecure:.*false/ allow_insecure: true/g" $daos_config
sed -i "s/.*access_points.*/access_points: $access_points/g" $daos_config
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the goal here is only to support multiple mounts from the server Parallelstore instance, or to support mounts from multiple Parallelstore instances?

I fear, that with multiple Parallelstore instances, the last script that will run will setup access_points in the config, and it will be the only instance mounted (but multiple times)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal is to have 1 parallelstore instance but able to mount to multiple mount points (possible with different mount options).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, perfect. I think it is worth mentioning in the README, that module supports mounting only from Parallelstore instance, but it's possibile to mount it multiple times with different options using pre-existing-network-storage.

Probably worth mentioning in both modules.


# Get interface names with "s0f0" suffix
if ifconfig -a | grep 's0f0'; then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that you are attempting to identify devices that are on the same PCI bus as GPUs in the A3 VM types (a3-highgpu-8g and a3-megagpu-8g). This is not a reliable mechanism by which to do that.

  1. I believe ifconfig is not uniformly installed across OS clients. (e.g. Ubuntu 22.04)
  2. The "tail" of the device name is the least predictable element of the automated device name.

The following command strictly identifies ethernet devices not attached to PCI slot 0, which is guaranteed to be nic0 as seen by the Cloud Console. It is also the interface guaranteed to route to the metadata server

find /sys/class/net/ -not -name 'enp0s*' -regextype posix-extended -regex '.*/enp[0-9]+s.*' -printf "%f\n" | paste -s -d ','

I believe you could this command and then test for whether it has zero length.

sof0_interfaces=$(ifconfig -a | grep 's0f0:' | awk '{print $1}' | tr ':' '\n' | grep -v '^$' | awk '!a[$0]++' | sed 's/^/"/g' | sed 's/$/"/g' | paste -sd, -)

# Append the sof0_interfaces to the existing list
exclude_fabric_ifaces="lo,$sof0_interfaces"

# Update the file with the new list
sed -i "s/#.*exclude_fabric_ifaces: \[.*/exclude_fabric_ifaces: [$exclude_fabric_ifaces]/" $daos_config
fi

# Start service
if { [ "${OS_ID}" = "rocky" ] || [ "${OS_ID}" = "rhel" ]; } && { [ "${OS_VERSION_MAJOR}" = "8" ] || [ "${OS_VERSION_MAJOR}" = "9" ]; }; then
# TODO: Update script to change default log destination folder, after daos_agent user is supported in debian and ubuntu.
Expand Down Expand Up @@ -69,39 +80,33 @@ sed -i "s/#.*user_allow_other/user_allow_other/g" $fuse_config
# make sure limit of open files is high enough for dfuse (1M of open files)
ulimit -n 1048576

for i in {1..10}; do
# To parse mount_options as --disable-wb-cache --eq-count=8.
# shellcheck disable=SC2086
dfuse -m "$local_mount" --pool default-pool --container default-container --multi-user $mount_options && break

echo "dfuse failed, retrying in 1 seconds (attempt $i/10)..."
sleep 1
done

if ! mountpoint -q "$local_mount"; then
exit 1
fi

# Store the mounting logic in a variable
mount_command='for i in {1..10}; do /bin/dfuse -m '$local_mount' --pool default-pool --container default-container --multi-user '$mount_options' --foreground && break; echo \"dfuse, failed, retrying in 1 second (attempt '$i'/10)\"; sleep 1; done'
mount_command="if mountpoint -q '$local_mount'; then fusermount3 -u '$local_mount'; fi; for i in {1..10}; do /bin/dfuse -m '$local_mount' --pool default-pool --container default-container --multi-user $mount_options --foreground && break; sleep 1; done"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would remove this line entirely. The command should not include sleep logic.


# Construct the service name with the local_mount suffix
service_name="mount_parallelstore_${local_mount//\//_}.service"
Copy link
Member

@tpdownes tpdownes Nov 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use systemd-escape -p "${local_mount}" to generate that part of the unit name.


# --- Begin: Add systemd service creation ---
cat >/usr/lib/systemd/system/mount_parallelstore.service <<EOF
cat >/usr/lib/systemd/system/"${service_name}" <<EOF
[Unit]
Description=DAOS Mount Service
After=network-online.target daos_agent.service
Comment on lines 91 to 93
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
[Unit]
Description=DAOS Mount Service
After=network-online.target daos_agent.service
[Unit]
Description=DAOS Mount Service
After=network-online.target daos_agent.service
Before=slurmd.service
ConditionPathIsMountPoint=!${local_mount}


[Service]
Type=oneshot
Type=simple
User=root
Group=root
ExecStart=/bin/bash -c '$mount_command'
Restart=always
RestartSec=1
ExecStart=/bin/bash -c "$mount_command"
ExecStop=fusermount3 -u '$local_mount'
Comment on lines +99 to +102
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Restart=always
RestartSec=1
ExecStart=/bin/bash -c "$mount_command"
ExecStop=fusermount3 -u '$local_mount'
Restart=on-failure
RestartSec=10
ExecStart=/bin/dfuse -m $local_mount --pool default-pool --container default-container --multi-user $mount_options --foreground
ExecStop=/usr/bin/fusermount3 -u $local_mount

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After you make these changes and we discuss how multi-Parallelstore instance suport would look like, I would like to consider using systemd unit templates to avoid re-creation of multiple service files.

https://fedoramagazine.org/systemd-template-unit-files/


[Install]
WantedBy=multi-user.target
EOF

systemctl enable mount_parallelstore.service
systemctl enable "${service_name}"
systemctl start "${service_name}"
# --- End: Add systemd service creation ---

exit 0
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,17 @@ sed -i "s/#.*transport_config/transport_config/g" $daos_config
sed -i "s/#.*allow_insecure:.*false/ allow_insecure: true/g" $daos_config
sed -i "s/.*access_points.*/access_points: $access_points/g" $daos_config

# Get interface names with "s0f0" suffix
if ifconfig -a | grep 's0f0'; then
sof0_interfaces=$(ifconfig -a | grep 's0f0:' | awk '{print $1}' | tr ':' '\n' | grep -v '^$' | awk '!a[$0]++' | sed 's/^/"/g' | sed 's/$/"/g' | paste -sd, -)

# Append the sof0_interfaces to the existing list
exclude_fabric_ifaces="lo,$sof0_interfaces"

# Update the file with the new list
sed -i "s/#.*exclude_fabric_ifaces: \[.*/exclude_fabric_ifaces: [$exclude_fabric_ifaces]/" $daos_config
fi

# Start service
if { [ "${OS_ID}" = "rocky" ] || [ "${OS_ID}" = "rhel" ]; } && { [ "${OS_VERSION_MAJOR}" = "8" ] || [ "${OS_VERSION_MAJOR}" = "9" ]; }; then
# TODO: Update script to change default log destination folder, after daos_agent user is supported in debian and ubuntu.
Expand Down Expand Up @@ -69,39 +80,33 @@ sed -i "s/#.*user_allow_other/user_allow_other/g" $fuse_config
# make sure limit of open files is high enough for dfuse (1M of open files)
ulimit -n 1048576

for i in {1..10}; do
# To parse mount_options as --disable-wb-cache --eq-count=8.
# shellcheck disable=SC2086
dfuse -m "$local_mount" --pool default-pool --container default-container --multi-user $mount_options && break

echo "dfuse failed, retrying in 1 seconds (attempt $i/10)..."
sleep 1
done

if ! mountpoint -q "$local_mount"; then
exit 1
fi

# Store the mounting logic in a variable
mount_command='for i in {1..10}; do /bin/dfuse -m '$local_mount' --pool default-pool --container default-container --multi-user '$mount_options' --foreground && break; echo \"dfuse, failed, retrying in 1 second (attempt '$i'/10)\"; sleep 1; done'
mount_command="if mountpoint -q '$local_mount'; then fusermount3 -u '$local_mount'; fi; for i in {1..10}; do /bin/dfuse -m '$local_mount' --pool default-pool --container default-container --multi-user $mount_options --foreground && break; sleep 1; done"

# Construct the service name with the local_mount suffix
service_name="mount_parallelstore_${local_mount//\//_}.service"

# --- Begin: Add systemd service creation ---
cat >/usr/lib/systemd/system/mount_parallelstore.service <<EOF
cat >/usr/lib/systemd/system/"${service_name}" <<EOF
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this land in /usr/lib/systemd/system or /etc/systemd/system? See table 1 here. I'd lean towards putting those into /etc/systemd/system

[Unit]
Description=DAOS Mount Service
After=network-online.target daos_agent.service

[Service]
Type=oneshot
Type=simple
User=root
Group=root
ExecStart=/bin/bash -c '$mount_command'
Restart=always
RestartSec=1
ExecStart=/bin/bash -c "$mount_command"
ExecStop=fusermount3 -u '$local_mount'

[Install]
WantedBy=multi-user.target
EOF

systemctl enable mount_parallelstore.service
systemctl enable "${service_name}"
systemctl start "${service_name}"
# --- End: Add systemd service creation ---

exit 0
Loading