Effective Problem
We store user bearer tokens in compute job configurations in the internal postgres database. In normal operation this isn't an issue, since the config that comes in on the HTTP request is the same value that is put onto the queue to be executed.
The 'abnormal' operation is a service restart, when the queue is repopulated from jobs that were marked as queued at shutdown. In this case, the config is pulled from the database and put onto the queue to be executed.
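Roughly, that restart path looks like the sketch below. This is illustrative only: sqlite3 stands in for the real postgres database, and the table and column names (`compute_jobs`, `config`, `status`) are assumptions, not our actual schema.

```python
import json
import queue
import sqlite3

def requeue_on_startup(db: sqlite3.Connection, q: queue.Queue) -> None:
    """Repopulate the work queue from jobs still marked 'queued'.

    The config -- including whatever bearer token was stored when the
    job was first submitted -- comes straight from the database, so by
    the time the job runs that token may well have expired.
    """
    rows = db.execute(
        "SELECT id, config FROM compute_jobs WHERE status = 'queued'"
    )
    for jid, config_json in rows:
        q.put((jid, json.loads(config_json)))  # stale token rides along
```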
The bearer token is not considered when hashing a job config to derive its ID, and the config in postgres is never updated. So the token that survives a restart is whichever one was stored when the config was first seen, meaning a likely-expired bearer token from some arbitrary user will be used when making requests to the subsetting and merging APIs.
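To make the collision concrete, here is a minimal sketch of token-ignoring ID hashing; the field names and the SHA-256-over-canonical-JSON scheme are assumptions, not the actual implementation.

```python
import hashlib
import json

def job_id(config: dict) -> str:
    """Derive a deterministic job ID from a config, ignoring the token."""
    hashable = {k: v for k, v in config.items() if k != "bearer_token"}
    canonical = json.dumps(hashable, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()

# Two users submitting the "same" job with different tokens collide on ID,
# so only the first-seen config (token included) ever reaches postgres:
a = {"dataset": "abc", "op": "subset", "bearer_token": "user-1-token"}
b = {"dataset": "abc", "op": "subset", "bearer_token": "user-2-token"}
assert job_id(a) == job_id(b)
```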
Root Problem
The root problem here is that the compute service is making use of user auth in the first place. Compute jobs are not tied to users: every user submitting job config ABC hits the cache and gets the same job ID, and there is no requesting user at the time the compute jobs are executed. We are forwarding request auth along as a hack to avoid the implementation cost of a sane mechanism for compute jobs to get data.
Possible Options
A. Quick & Dirty Band-Aid
Keep the existing ugly hack, and add a further hack: update the config in postgres whenever a compute job is resubmitted. The bearer token in the db will then most likely still be valid when the service restarts and the job gets requeued.
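For illustration, one shape the band-aid could take, reusing the `job_id` helper from the hashing sketch above. The upsert syntax and schema are again assumptions, with sqlite standing in for postgres.

```python
import json
import sqlite3

def submit_job(db: sqlite3.Connection, config: dict) -> str:
    """Submit a job, refreshing the stored config on resubmission.

    The upsert means a cache hit no longer leaves the original (possibly
    long-expired) token in the database: whichever token was seen most
    recently is the one a restart will requeue the job with.
    """
    jid = job_id(config)  # the token-ignoring hash sketched above
    db.execute(
        "INSERT INTO compute_jobs (id, config, status) "
        "VALUES (?, ?, 'queued') "
        "ON CONFLICT (id) DO UPDATE SET config = excluded.config",
        (jid, json.dumps(config)),
    )
    db.commit()
    return jid
```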
B. Actually Solve the Problem
Implement a mechanism through which compute jobs can get data from the subsetting and merging APIs without needing some user's bearer token.
Remove the bearer tokens from the compute job configs altogether.
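One possible shape for this, sketched as an OAuth2-style client-credentials flow. Everything here is an assumption: the token endpoint, parameter names, and response fields are illustrative, and the real mechanism could just as well be a shared service account, signed internal tokens, or mTLS between services.

```python
import time
import requests

class ServiceTokenProvider:
    """Fetch and cache a service-level token for the compute service.

    Jobs call the subsetting/merging APIs with this token instead of a
    forwarded user token, so nothing user-specific needs to live in the
    job config at all.
    """

    def __init__(self, token_url: str, client_id: str, client_secret: str):
        self._token_url = token_url
        self._client_id = client_id
        self._client_secret = client_secret
        self._token = None
        self._expires_at = 0.0

    def token(self) -> str:
        if self._token is None or time.time() >= self._expires_at:
            resp = requests.post(
                self._token_url,
                data={
                    "grant_type": "client_credentials",
                    "client_id": self._client_id,
                    "client_secret": self._client_secret,
                },
            )
            resp.raise_for_status()
            payload = resp.json()
            self._token = payload["access_token"]
            # Refresh a minute early rather than use a token at the edge
            # of its lifetime (assumes an "expires_in" field in seconds).
            self._expires_at = time.time() + payload["expires_in"] - 60
        return self._token
```

With service-level credentials like these, the bearer token field disappears from the job config, from the postgres rows, and from the restart path entirely.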