queue.put() inside a transaction sets enqueued_at to the transaction start time, not the current time #65

Open
miggec opened this issue Mar 22, 2021 · 4 comments


miggec commented Mar 22, 2021

Expected Behavior

I'd expect enqueued_at to represent the time at which the task was put into the queue table (e.g. CLOCK_TIMESTAMP()), since that is less arbitrary than the start time of the transaction (CURRENT_TIMESTAMP).

Actual Behavior

enqueued_at is set to the transaction start time by default.

Steps to Reproduce the Problem

import time

# q is an existing pq queue; entering the block opens a transaction
with q as cursor:
    thing = q.get()
    # do some processing
    time.sleep(3)
    # put the task back on the queue
    q.put(thing.data)  # enqueued_at here is the transaction start time, not the current time

Specifications

  • Version: 1.9.0
  • Python version: 3.8
  • PostgreSQL version: 10.10
Owner

malthe commented Mar 23, 2021

I can see why that might be more ideal, but to motivate this, do you have a specific situation where this is a problem? If we change the behavior, how would we motivate/document this change in the change log?

Author

miggec commented Mar 25, 2021

Thanks - ultimately I defer to you as to whether the behaviour should be changed or not by default, as it's fairly trivial to override with a bit of SQL.
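For anyone wanting the override in the meantime, here is a minimal sketch of what "a bit of SQL" could look like. It assumes that enqueued_at is filled in by a column default rather than set explicitly by put(), and that the table is named queue; the helper name use_clock_timestamp_default is hypothetical and not part of the library. It accepts any DB-API style cursor.

```python
def use_clock_timestamp_default(cursor, table="queue"):
    """Make enqueued_at default to the wall-clock time of the INSERT.

    Hypothetical helper, not part of the library. Assumes enqueued_at is
    populated by a column default and that `table` is a trusted identifier,
    since it is interpolated directly into the statement.
    """
    statement = (
        f"ALTER TABLE {table} "
        "ALTER COLUMN enqueued_at SET DEFAULT clock_timestamp()"
    )
    cursor.execute(statement)
    return statement
```

CLOCK_TIMESTAMP() advances during a transaction, whereas CURRENT_TIMESTAMP stays fixed at the transaction start, which is why swapping the default should be enough if the assumption about the column default holds.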

My use case that motivated this report: I want to perpetually process some tasks, and then re-queue them to be processed again in the near future. I dequeue and enqueue within the same transaction to be sure that the task is always in my queue. I expect to have hundreds of tasks and a handful of workers, so tasks may sit in the queue for a few minutes or more.

I want to measure and report on the length of time that tasks spend sitting in the queue, so that I can do something about it if that becomes too long. But for that I need an accurate enqueue time, otherwise I am also measuring the length of time taken by my task.
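To make the skew concrete, here is a small worked example in plain Python with made-up timestamps (no database involved). The function queue_wait and all the values are illustrative assumptions; the 3 seconds of processing mirrors the sleep in the reproduction steps.

```python
from datetime import datetime, timedelta


def queue_wait(enqueued_at, dequeued_at):
    """Time a task spent sitting in the queue."""
    return dequeued_at - enqueued_at


# Illustrative values only: a worker opens a transaction, processes the
# task for 3 seconds, then re-queues it inside the same transaction.
tx_start = datetime(2021, 3, 25, 12, 0, 0)
processing = timedelta(seconds=3)
actual_put = tx_start + processing

# Another worker picks the task up two minutes after it was re-queued.
next_get = actual_put + timedelta(minutes=2)

# With enqueued_at set to the transaction start, the reported wait
# also includes the processing time of the previous run.
wait_reported = queue_wait(tx_start, next_get)
wait_true = queue_wait(actual_put, next_get)
overestimate = wait_reported - wait_true
```

The overestimate equals the processing time, so the longer a task takes, the more the reported queue time drifts from the real one.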

The motivation statement in my view: the enqueued_at time should be the time a task was enqueued. The start of a transaction is an implementation detail that has little relevance to that concept, rendering the timestamp somewhat arbitrary in this case.

Owner

malthe commented Mar 25, 2021

@stas I think this makes sense.

Collaborator

stas commented Mar 26, 2021

Oh, yup it does, even though it sounds quite specific.

Generally I don't think this should affect end-users too much, otherwise this issue would have been brought up already 😂
I'm totally ok with changing this!
