
ROACH trigger setting logging flexibility #202

Open
wcpettus opened this issue Aug 22, 2019 · 2 comments

@wcpettus

We log the time_window_settings and trigger_settings methods of the roach_daq_run_interface. Originally we had separate psyllid and dragonfly daq channel configs for these two settings, and would toggle both in step. Since moving psyllid to Kubernetes, the dragonfly daq keeps trying to log a psyllid method that is no longer available and throws long errors each time.

Should the dragonfly daq provider and psyllid run together in the same pod (Kubernetes has a model for this), and would that give the right restart behavior?
Can the daq provider method be made more tolerant without losing sensitivity to the failure modes we care about? (A rough sketch of one option is below.)

Currently, changing to the streaming config requires operator intervention:
dragonfly set r2_channel_{X}_time_window.schedule_status off -b myrna.p8
dragonfly set r2_channel_{X}_trigger_settings.schedule_status off -b myrna.p8

We don't do streaming very often, so this may be a rare edge case.
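One possible direction for the second question above is to have the scheduled logging catch the specific "endpoint unavailable" failure and downgrade it to a single warning, while letting any other failure propagate as before. A minimal sketch of that idea, with hypothetical names for the exception type and the psyllid query (this is not dragonfly's actual API):

```python
import logging

logger = logging.getLogger(__name__)

# Placeholder for whatever exception the daq provider sees when a psyllid
# endpoint does not answer (e.g. a request timeout); purely illustrative.
class EndpointUnavailable(Exception):
    pass

def log_setting_tolerantly(get_setting, setting_name):
    """Query a setting and log it, tolerating an unreachable psyllid endpoint.

    `get_setting` stands in for the call the daq provider makes to psyllid
    (e.g. fetching trigger_settings); unexpected errors still propagate.
    """
    try:
        value = get_setting()
    except EndpointUnavailable as err:
        logger.warning("skipping %s: psyllid endpoint unavailable (%s)", setting_name, err)
        return None
    logger.info("%s = %s", setting_name, value)
    return value
```

The point of only catching the one expected failure mode is that genuine misconfigurations and other errors still show up loudly in the logs.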

@laroque (Member) commented Aug 22, 2019

I don't think I've fully understood this issue yet but:

  • If the daq provider and psyllid should have coupled lifecycles, then it makes sense to put them into the same pod.
  • If one of them crashes, the default behavior in k8s is to restart only that container, not every container in the pod.
  • If the pod specification is changed/updated, the new helm release will replace the entire pod, which means restarting both/all containers.

.... I'm not sure if this fully addresses the questions above.
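To make the shared-pod idea concrete, here is a plain-Python rendering of roughly what such a pod could look like; every name, image, and health check below is a placeholder rather than the project's actual helm values. With the default restart policy, a failed liveness probe causes the kubelet to restart only the failing container, while a new helm release that changes this spec replaces the whole pod:

```python
import json

# Sketch of a two-container pod spec; all names, images, and probes are
# placeholders for illustration, not the real chart configuration.
shared_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "roach-daq"},
    "spec": {
        "restartPolicy": "Always",  # applied per container: only the failed one is restarted
        "containers": [
            {
                "name": "psyllid",
                "image": "example/psyllid:latest",
                "livenessProbe": {
                    "exec": {"command": ["/bin/sh", "-c", "true"]},  # placeholder health check
                    "periodSeconds": 30,
                },
            },
            {
                "name": "dragonfly-daq-provider",
                "image": "example/dragonfly:latest",
                "livenessProbe": {
                    "exec": {"command": ["/bin/sh", "-c", "true"]},  # placeholder health check
                    "periodSeconds": 30,
                },
            },
        ],
    },
}

if __name__ == "__main__":
    # Print the manifest structure; in practice this would live in the helm chart as YAML.
    print(json.dumps(shared_pod, indent=2))
```

In practice the same structure would be expressed in the helm chart templates rather than built in Python; the sketch is only meant to show how the two containers and their probes would sit in one pod.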

@wcpettus (Author)

Does a liveness probe trigger a pod-level or a container-level restart?
