Failed to call ‘uploadInputFiles’ if the pod is on a different node than the Kestra worker in EKS #148
Comments
Update: I think I have found the likely culprit. Before that, I checked several potential scenarios.
The last one is the main cause of the failed file upload. When I add a dummy init container to the flow with just a 10s delay, the file uploads successfully. My current setup is on EKS, where I use Karpenter to scale nodes up and down dynamically based on node selectors and the resources the pods require. When the flow is triggered by an SQS message, it tries to create a pod that must be scheduled on a specific node. If that node is not readily available, Karpenter provisions it and the pod is then deployed on the newly provisioned node. The file upload fails during this window. To work around this, I added a dummy init container that just sleeps for 10s before the flow proceeds with the file upload container in the task pod.
Sample flow
I am not sure how the readiness check behind the init container works, but it would help if you could let us modify parts of the fileSidecar configuration, such as the sleep duration or the number of retries. A better way of detecting node readiness would also help, but for now this trick does the job. I hope this helps. Let me know if you want more information on this.
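To make the request concrete, here is a minimal Python sketch of the kind of configurable wait I have in mind: a readiness check that is retried a fixed number of times with a delay between attempts. This is not how the Kestra file sidecar is implemented; `wait_until_ready`, `retries`, and `delay_seconds` are hypothetical names used only for illustration.

```python
import time
from typing import Callable


def wait_until_ready(
    check: Callable[[], bool],
    retries: int = 5,
    delay_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call check() up to `retries` times, sleeping between failed attempts.

    Hypothetical helper: `check` stands in for whatever readiness probe
    the upload step would use (for example, "is the target container
    running yet?").
    """
    for attempt in range(1, retries + 1):
        if check():
            return True
        # Do not sleep after the final failed attempt.
        if attempt < retries:
            sleep(delay_seconds)
    return False
```

With both values exposed, a slow Karpenter node provision could be absorbed by raising `retries` or `delay_seconds` instead of adding a dummy init container.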
@loicmathieu tagging you only so you can check 👍
Expected Behavior
I have an SQS trigger. When a new message arrives in the queue, it is converted into a .jsonl file and the file URI is passed as inputFiles to kubernetes.PodCreate. The file should then be accessible inside the pod and processed.
Actual Behaviour
When I pass nodeSelectors and tolerations to the Kubernetes pod so that it is deployed on a different node (not the one where the kestra-worker is deployed), the busy-box image fails to upload the file that I am passing via the flow. When I remove the node selectors and tolerations, the inputFiles upload works as intended. From my observation, it fails only when Kestra and the newly created task pod are not on the same node. By the way, I use Karpenter to scale the EKS nodes up and down dynamically (mentioning it in case it is related).
Steps To Reproduce
Error log from the task that creates the pod and fails
Environment Information
Example flow