Inconsistent data persistence configuration #387
Comments
One inconsistency I found is as follows:

> birdhouse-deploy/birdhouse/components/weaver/docker-compose-extra.yml, lines 69 to 78 in 33042e6

Because Weaver requires the same path locations:

```
/var/lib/docker/volumes/birdhouse_wps_outputs/_data/
  finch/
  hummingbird/
  raven/
  weaver/
```

The …

> birdhouse-deploy/birdhouse/components/cowbird/config/cowbird/config.yml.template, line 114 in 33042e6
I didn't use any overrides while testing those, and didn't find any issues. Note that I only tested using Magpie/Cowbird/JupyterHub and didn't validate the birds that produce wpsoutputs data; I only used fake wpsoutputs data to validate my features. I probably didn't have a problem since Cowbird uses a direct path for the …
Other use cases:

> birdhouse-deploy/birdhouse/config/jupyterhub/docker-compose-extra.yml, lines 33 to 34 in 33042e6

> birdhouse-deploy/birdhouse/config/thredds/docker-compose-extra.yml, lines 21 to 24 in 33042e6
Which causes Cowbird (by default) to handle permission-sync requests "correctly" within …

Note that a …
Can we also add to the discussion: why are we creating some named volumes outside of the … I'm talking specifically about: …

I think it's also confusing that we are treating these volumes differently from named volumes in docker-compose and bind mounts.
Yes, we can add that as well.

In the case of … Not sure about the others.
@mishaschwartz

```
Error response from daemon: failed to mount local volume: mount /data/thredds:/var/lib/docker/volumes/birdhouse_thredds_persistence/_data, flags: 0x1000: no such file or directory
```

Although the docker-compose config seems to indicate what I expect:

```yaml
volumes:
  thredds_persistence:
    driver_opts:
      device: /data/thredds
      o: bind
      type: none
```

... I believe the pre-creation of … Using the same strategy as …

@tlvu Any idea about this comment:

> As @mishaschwartz pointed out, unless `--volumes` is added, even `pavics-compose down` does not remove this volume, and it also stays even when it is not attached anymore to any service. Could it be an old docker behaviour, or some other cleanup manipulation that caused the mentioned deletion of data?

reference: https://gitlab.com/crim.ca/clients/daccs/daccs-configs/-/merge_requests/28
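For reference, a minimal sketch of the pre-creation approach being discussed, under the assumption that the bind target must exist on the host before `docker-compose up` (the `/data/thredds` path simply mirrors the error message above and is not prescriptive):

```yaml
# Hypothetical override fragment, not the project's actual file.
# The local driver's bind mount fails with "no such file or directory"
# unless the device path already exists on the host, so something like
# `mkdir -p /data/thredds` must run beforehand.
volumes:
  thredds_persistence:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /data/thredds
```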
First off, thanks @fmigneault for opening this issue. This is exactly what I was worried about in my comment #360 (comment). I felt the concern I raised about the location of …

Now I will try to answer the various questions; ping me if I forget some.

The …

All the external data-volumes (…)

So if data-volume is great, why do we need …

So the decisions are purely opportunistic. I have no objections if we want to standardize all the external data-volumes (…).

For Cowbird, maybe we need to migrate …

I think for the moment, the cheapest route is to have a full end-to-end test with Cowbird inside a real PAVICS, with a real bird actually generating a WPS output for Cowbird to hardlink. If we are lucky and can work out some magic to make this work, then no …
One problem with the current default setup is that there are multiple "wps_outputs", since some are referenced by named volumes (finch, raven, hummingbird), while others are …

For the external data-volumes, having them with explicit drive paths would persist the contents. Indeed, a …

For Cowbird, …
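One reason the backing locations of the various "wps_outputs" matter for Cowbird: hard links only work within a single filesystem (`os.link` raises `OSError` with errno `EXDEV` across devices), so outputs and user workspaces resolved to different backing mounts cannot be hardlinked. A minimal illustration, with purely illustrative file names:

```python
import os
import tempfile

# Sketch: hard-linking a (fake) WPS output into a "workspace" path only
# works when both paths live on the same filesystem; across filesystems
# os.link fails with OSError (errno EXDEV).
def hardlink_output(src: str, dst: str) -> bool:
    """Hard-link src to dst and report whether both names share one inode."""
    os.link(src, dst)
    return os.stat(src).st_ino == os.stat(dst).st_ino

workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "output.nc")  # stand-in for a bird's wps output
with open(src, "w") as f:
    f.write("fake wps output data")

same_inode = hardlink_output(src, os.path.join(workdir, "workspace-link.nc"))
print(same_inode)  # True: same filesystem, so the link succeeds
```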
@fmigneault They don't, for now. But Cowbird will eventually be in the default "minimal" list of components enabled automatically. I guess, as you said, having Cowbird enabled but misconfigured is fine since it is unused.
Description
According to the following definition:
> birdhouse-deploy/birdhouse/default.env, lines 11 to 12 in 33042e6
Setting this parameter seems to indicate that the server would be configured such that data persistence is applied.

However, unless something more is defined to override `volumes`, the following definitions are applied by default:

https://github.com/bird-house/birdhouse-deploy/blob/master/birdhouse/config/data-volume/docker-compose-extra.yml
https://github.com/bird-house/birdhouse-deploy/blob/master/birdhouse/config/wps_outputs-volume/docker-compose-extra.yml

Therefore, the `data` and `wps_outputs` volumes are not "persisted" at all. They reside in a temporary volume created by docker-compose, which could be wiped on any `down` operation. (Note: here it is assumed that "persistence" means that the data lives across server recreation following `down`/`up`. If the server crashes and `DATA_PERSIST_ROOT` is not itself mounted from another network/backup location, the data could still be lost.)

One way to define volume overrides so that data is actually "persisted" is as follows (assuming the default values are applied):

…
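The example snippet referenced here did not survive extraction. A hedged reconstruction, patterned on the `thredds_persistence` override quoted elsewhere in this thread and on the compose `driver_opts` bind pattern, could look like the following; the volume names match the defaults discussed above, but the exact `device` sub-paths are assumptions:

```yaml
# Hedged reconstruction, not the original snippet. Assumes DATA_PERSIST_ROOT
# is available for compose variable substitution and that the sub-paths
# exist on the host before `up`.
volumes:
  data:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: ${DATA_PERSIST_ROOT}/datasets      # assumed sub-path
  wps_outputs:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: ${DATA_PERSIST_ROOT}/wps_outputs   # assumed sub-path
```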
Questions

- Should we help users set up these definitions by default (instead of the currently misleading configuration)? `driver_opts` can differ between setups depending on the actual drive/mount strategy, so the sample config above is not necessarily applicable to all use cases.
- … (e.g. `EXTRA_CONFS_DIR`)?
- … the `DATA_PERSIST_ROOT` variable (and others)?
- How do Cowbird and the WPS-outputs/user-workspace synchronisation behave without these overrides? @ChaamC?
- Other considerations for existing servers / deployment procedures? @mishaschwartz @tlvu?