Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use new unique jobs implementation #32

Merged
merged 1 commit into from
Dec 19, 2024
Merged

Conversation

bgentry
Copy link
Contributor

@bgentry bgentry commented Dec 13, 2024

This moves this library to use the new unique jobs implementation from riverqueue/river#590 and migrates the Sequel driver to use a unified insertion path, allowing bulk inserts to use unique jobs.

Questions:

  1. Now that bulk inserts support unique jobs and can return full rows, should we adjust the return signature of insert_many like we did in River itself? This would allow detection of duplicates even with bulk inserts.
  2. It seems like the client specs are only hitting the sequel driver and not ActiveRecord. I haven't actually fully converted the AR side yet anyway, and I'm not sure if this note is going to turn out to be a blocker. Likewise with whether I can make the unified insert path work well in AR.
  3. While building this I realized I may have neglected an issue in #590, which is that if you attempt to insert duplicate rows in the same query, you'll get a Postgres error about not being able to UPSERT a row from the same query. We could handle this by scanning all rows-to-be-inserted to see if there are duplicate unique params, and we may want to do that in the Go library too for better devex. Thoughts?
  4. I removed code specific to the old path and didn't attempt to provide a migration path in this client library given that the Go library had a sequence of releases which were compatible with either strategy. If we want to do that, it gets a bit more complex, but lmk what you think.

@bgentry bgentry requested a review from brandur December 13, 2024 05:16
data_set = data_set.where(queue: get_params.queue) if get_params.queue
data_set = data_set.where(state: get_params.state) if get_params.state
data_set.first ? to_job_row(data_set.first) : nil
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely nice that this can all go away.

driver/riverqueue-sequel/lib/driver.rb Outdated Show resolved Hide resolved
Copy link
Contributor

@brandur brandur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for sending this over! Looking good so far. Does result in some nice clean up in a few places which is great.

Now that bulk inserts support unique jobs and can return full rows, should we adjust the return signature of insert_many like we did in River itself? This would allow detection of duplicates even with bulk inserts.

Yeah, seems like a good idea to keep the API the same as what's in Go. The return values might be useful to someone too.

It seems like the client specs are only hitting the sequel driver and not ActiveRecord. I haven't actually fully converted the AR side yet anyway, and I'm not sure if this note is going to turn out to be a blocker. Likewise with whether I can make the unified insert path work well in AR.

Yeah, I need to get the tests sorted out a little better. I didn't have a fully unified theory around how testing should work, so originally I had it so that the driver specs would test everything end to end using the database, and the client tests would use a mock layer. The mock layer turned out to be so useless for testing anything and was hard to maintain though, so I ended up just getting rid of it in favor of testing end to end in the client specs as well. They arbitrarily use only one driver, but it's probably okay since the driver specs should be testing pretty thoroughly themselves (see spec/driver_shared_examples.rb).

Regarding the note: hmm, that is unfortunate. This might be require some more bushwhacking around AR source code. There's gotta be a way to make it work. With any luck, they might've added some fixes since I wrote that.

While building this I realized I may have neglected an issue in #590, which is that if you attempt to insert duplicate rows in the same query, you'll get a Postgres error about not being able to UPSERT a row from the same query. We could handle this by scanning all rows-to-be-inserted to see if there are duplicate unique params, and we may want to do that in the Go library too for better devex. Thoughts?

Yeah I could see it either way probably. It might be okay just to leave this error to bubble up to the user. I've run into it a number of times during my time working with Go/Postgres over the last four years, and although every time I was annoyed at Postgres for the generic-sounding/kind of unhelpful error, it turned out that I was doing something wrong in my Go. e.g. I wasn't going through a data set which I was about to insert and making it unique, which I should've been doing.

I removed code specific to the old path and didn't attempt to provide a migration path in this client library given that the Go library had a sequence of releases which were compatible with either strategy. If we want to do that, it gets a bit more complex, but lmk what you think.

Yeah, let's just get it fixed without a migration path. It's a little suboptimal, but this library doesn't have enough users to spend a huge amount of effort on that. If people run into problems we'll ask them to upgrade everything to latest.

@bgentry bgentry force-pushed the bg-updated-unique-jobs branch 3 times, most recently from 1fde20c to 07a2f6c Compare December 17, 2024 02:55
@bgentry bgentry marked this pull request as ready for review December 17, 2024 03:02
@bgentry bgentry marked this pull request as draft December 17, 2024 03:35
@bgentry bgentry force-pushed the bg-updated-unique-jobs branch from 07a2f6c to 561cf00 Compare December 17, 2024 03:56
@bgentry bgentry marked this pull request as ready for review December 17, 2024 03:56
@bgentry bgentry requested a review from brandur December 17, 2024 03:56
@bgentry
Copy link
Contributor Author

bgentry commented Dec 17, 2024

Alright @brandur, this should be actually ready now. ActiveRecord stuff seems to work just fine with bulk unique inserts, or at least with a unified insert path.

@bgentry bgentry force-pushed the bg-updated-unique-jobs branch from 561cf00 to 6213718 Compare December 17, 2024 03:57
Copy link
Contributor

@brandur brandur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. Maybe add a changelog entry as well for completeness and we'll get this shipped out ASAP.

driver/riverqueue-activerecord/lib/driver.rb Outdated Show resolved Hide resolved
lib/client.rb Outdated Show resolved Hide resolved
unless unique_opts.exclude_kind
unique_key += "&kind=#{insert_params.kind}"
end

if unique_opts.by_args
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Realizing that this PR does not include the expansion of by_args to allow uniqueness by a subset of args. It also does not yet implement the required change to alphabetically sort the json by key name. The latter is required, though the former could conceivably be added later. Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I got both of these implemented pretty easily by setting by_args to be either a bool or an array of keys to be extracted from the JSON before sorting.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh nice. Well, that makes it easy then.

@bgentry bgentry force-pushed the bg-updated-unique-jobs branch 2 times, most recently from 33e8627 to 66dca34 Compare December 19, 2024 03:10
@bgentry bgentry force-pushed the bg-updated-unique-jobs branch from 66dca34 to 2831d33 Compare December 19, 2024 03:18
@bgentry bgentry requested a review from brandur December 19, 2024 03:18
Copy link
Contributor

@brandur brandur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, looks great. Let's get this shipped!

@bgentry bgentry merged commit 908377b into master Dec 19, 2024
6 checks passed
@bgentry bgentry deleted the bg-updated-unique-jobs branch December 19, 2024 19:22
brandur added a commit that referenced this pull request Dec 20, 2024
Prepare release 0.8.0, mostly containing the changes in #32.
@brandur brandur mentioned this pull request Dec 20, 2024
brandur added a commit that referenced this pull request Dec 20, 2024
Prepare release 0.8.0, mostly containing the changes in #32.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants