All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- The default max attempts of 25 can now be customized on a per-client basis using `Config.MaxAttempts`. This is in addition to the ability to customize at the job type level with `JobArgs`, or on a per-job basis using `InsertOpts`. PR #383.
- Add `JobDelete` / `JobDeleteTx` APIs on `Client` to allow permanently deleting any job that's not currently running. PR #390.
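As a rough illustration of how the three levels of max attempts customization mentioned above might combine, the sketch below resolves a value by checking the most specific level first. The helper `resolveMaxAttempts` is hypothetical and not part of River's API, and the precedence order (per-job, then job type, then client, then the default of 25) is an assumption made for the example.

```go
package main

import "fmt"

// defaultMaxAttempts mirrors the default of 25 mentioned in the changelog entry.
const defaultMaxAttempts = 25

// resolveMaxAttempts is a hypothetical helper (not River's API). It returns the
// first positive value, checking the most specific level first. The precedence
// order is an assumption for illustration.
func resolveMaxAttempts(perJob, perJobType, perClient int) int {
	for _, v := range []int{perJob, perJobType, perClient} {
		if v > 0 {
			return v
		}
	}
	return defaultMaxAttempts
}

func main() {
	fmt.Println(resolveMaxAttempts(0, 0, 0))  // nothing set, falls back to the default
	fmt.Println(resolveMaxAttempts(0, 0, 10)) // client-level override applies
	fmt.Println(resolveMaxAttempts(3, 5, 10)) // per-job value wins
}
```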
- Fix `StopAndCancel` to not hang if called in parallel to an ongoing `Stop` call. PR #376.
- River now considers per-worker timeout overrides when rescuing jobs so that jobs with a long custom timeout won't be rescued prematurely. PR #350.
- River CLI now exits with status 1 in the case of a problem with commands or flags, like an unknown command or missing required flag. PR #363.
- Fix migration version 4 (from 0.5.0) so that the up migration can be re-run after it was originally rolled back. PR #364.
- `RequireNotInserted` test helper (in addition to the existing `RequireInserted`) that verifies that a job with matching conditions was not inserted. PR #237.
- The periodic job enqueuer now sets `scheduled_at` of inserted jobs to the more precise time of when they were scheduled to run, as opposed to when they were inserted. PR #341.
- Remove use of `github.com/lib/pq`, making it once again a test-only dependency. PR #337.
- Add `pending` job state. This is currently unused, but will be used to build higher level functionality for staging jobs that are not yet ready to run (for some reason other than their scheduled time being in the future). Pending jobs will never be run or deleted and must first be moved to another state by external code. PR #301.

- Queue status tracking, pause and resume. PR #301.

  A useful operational lever is the ability to pause and resume a queue without shutting down clients. In addition to pause/resume being a feature request from #54, as part of the work on River's UI it's been useful to list out the active queues so that they can be displayed and manipulated.

  A new `river_queue` table is introduced in the v4 migration for this purpose. Upon startup, every producer in each River `Client` will make an `UPSERT` query to the database to either register the queue as being active, or if it already exists it will instead bump the timestamp to keep it active. This query will be run periodically in each producer as long as the `Client` is alive, even if the queue is paused. A separate query will delete/purge any queues which have not been active in a while (currently fixed to 24 hours).

  `QueuePause` and `QueueResume` APIs have been introduced to `Client` to pause and resume a single queue by name, or all queues using the special `*` value. Each producer will watch for notifications on the relevant `LISTEN/NOTIFY` topic unless operating in poll-only mode, in which case they will periodically poll for changes to their queue record in the database.
- Job insert notifications are now handled within application code rather than within the database using triggers. PR #301.

  The initial design for River utilized a trigger on job insert that issued notifications (`NOTIFY`) so that listening clients could quickly pick up the work if they were idle. While this is good for lowering latency, it does have the side effect of emitting a large number of notifications any time there are lots of jobs being inserted. This adds overhead, particularly to high-throughput installations.

  To improve this situation and reduce overhead in high-throughput installations, the notifications have been refactored to be emitted at the application level. A client-level debouncer ensures that these notifications are not emitted more often than they could be useful. If a queue is due for an insert notification (on a particular Postgres schema), the notification is piggy-backed onto the insert query within the transaction. While this has the impact of increasing insert latency for a certain percentage of cases, the effect should be small.

  Additionally, initial releases of River did not properly scope notification topics within the global `LISTEN/NOTIFY` namespace. If two River installations were operating on the same Postgres database but within different schemas (search paths), their notifications would be emitted on a shared topic name. This is no longer the case and all notifications are prefixed with a `{schema_name}.` string.

- Add `NOT NULL` constraints to the database for `river_job.args` and `river_job.metadata`. Normal code paths should never have allowed for null values anyway, but this constraint further strengthens the guarantee. PR #301.

- Stricter constraint on `river_job.finalized_at` to ensure it is only set when paired with a finalized state (completed, discarded, cancelled). Normal code paths should never have allowed for invalid values anyway, but this constraint further strengthens the guarantee. PR #301.
- Update job state references in `./cmd/river` and some documentation to `rivertype`. Thanks Danny Hermes (@dhermes)! 🙏🏻 PR #315.
- Breaking change: There are a number of small breaking changes in the job list API using `JobList` / `JobListTx`:
  - Now supports querying jobs by a list of Job Kinds and States. Also allows for filtering by specific timestamp values. Thank you Jos Kraaijeveld (@thatjos)! 🙏🏻 PR #236.
  - Job listing now defaults to ordering by job ID (`JobListOrderByID`) instead of a job timestamp dependent on the requested job state. The previous ordering behavior is still available with `NewJobListParams().OrderBy(JobListOrderByTime, SortOrderAsc)`. PR #307.
  - The function `JobListCursorFromJob` no longer needs a sort order parameter. Instead, sort order is determined based on the job list parameters that the cursor is subsequently used with. PR #307.
- Breaking change: Client `Insert` and `InsertTx` functions now return a `JobInsertResult` struct instead of a `JobRow`. This allows the result to include metadata like the new `UniqueSkippedAsDuplicate` property, so callers can tell whether an inserted job was skipped due to a unique constraint. PR #292.
- Breaking change: Client `InsertMany` and `InsertManyTx` now return the number of jobs inserted as `int` instead of `int64`. This change was made to make the type in use a little more idiomatic. PR #293.
- Breaking change: `river.JobState*` type aliases have been removed. All job state constants should be accessed through `rivertype.JobState*` instead. PR #300.
See also the 0.4.0 release blog post with code samples and rationale behind various changes.
- The River client now supports "poll only" mode with `Config.PollOnly` which makes it avoid issuing `LISTEN` statements to wait for new events like a leadership resignation or new job available. The program instead polls periodically to look for changes. A leader resigning or a new job being available will be noticed less quickly, but `PollOnly` potentially makes River operable on systems without listen/notify support, like PgBouncer operating in transaction pooling mode. PR #281.
- Added `rivertype.JobStates()` that returns the full list of possible job states. PR #297.
- New periodic jobs can now be added after a client's already started using `Client.PeriodicJobs().Add()` and removed with `Remove()`. PR #288.
- The level of some of River's common log statements has changed, most often demoting `info` statements to `debug` so that `info`-level logging is overall less verbose. PR #275.
- Fixed a bug in the (log-only for now) reindexer service in which it might repeat its work loop multiple times unexpectedly while stopping. PR #280.
- Periodic job enqueuer now bases next run times on each periodic job's last target run time, instead of the time at which the enqueuer is currently running. This is a small difference that will be unnoticeable for most purposes, but makes scheduling of jobs with short cron frequencies a little more accurate. PR #284.
- Fixed a bug in the elector in which it was possible for a resigning, but not completely stopped, elector to reelect despite having just resigned. PR #286.
Although it comes with a number of improvements, there's nothing particularly notable about version 0.1.0. Until now we've only been incrementing the patch version given the project's nascent nature, but from here on we'll try to adhere more closely to semantic versioning, using the patch version for bug fixes, and incrementing the minor version when new functionality is added.
- The River CLI now supports `river bench` to benchmark River's job throughput against a database. PR #254.
- The River CLI now has a `river migrate-get` command to dump SQL for River migrations for use in alternative migration frameworks. Use it like `river migrate-get --up --version 3 > version3.up.sql`. PR #273.
- The River CLI's `migrate-down` and `migrate-up` options get two new options for `--dry-run` and `--show-sql`. They can be combined to easily run a preflight check on a River upgrade to see which migration commands would be run on a database, but without actually running them. PR #273.
- The River client gets a new `Client.SubscribeConfig` function that lets a subscriber specify the maximum size of their subscription channel. PR #258.
- River uses a new job completer that batches up completion work so that large numbers of them can be performed more efficiently. In a purely synthetic (i.e. mostly unrealistic) benchmark, River's job throughput increases ~4.5x. PR #258.
- Changed default client IDs to be a combination of hostname and the time at which the client started. This can still be changed by specifying `Config.ID`. PR #255.
- Notifier refactored for better robustness and testability. PR #253.
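The batching idea behind the new job completer mentioned above (PR #258) can be sketched as follows. This is a synchronous toy model with hypothetical names (`batchCompleter`, `complete`, `flushPending`); the changelog does not describe the real completer's internals, which write to the database rather than call a callback.

```go
package main

import "fmt"

// batchCompleter is a hypothetical sketch that accumulates completed job IDs
// and hands them to flush in groups rather than one at a time.
type batchCompleter struct {
	maxBatch int
	pending  []int64
	flush    func(ids []int64)
}

// complete records a finished job and flushes once the batch is full.
func (b *batchCompleter) complete(id int64) {
	b.pending = append(b.pending, id)
	if len(b.pending) >= b.maxBatch {
		b.flushPending()
	}
}

// flushPending sends whatever is pending, if anything, and resets the batch.
func (b *batchCompleter) flushPending() {
	if len(b.pending) == 0 {
		return
	}
	b.flush(b.pending)
	b.pending = nil
}

func main() {
	flushes := 0
	b := &batchCompleter{
		maxBatch: 3,
		flush: func(ids []int64) {
			flushes++
			fmt.Println("flushing", ids)
		},
	}
	for id := int64(1); id <= 7; id++ {
		b.complete(id)
	}
	b.flushPending() // flush the final partial batch
	fmt.Println("total flushes:", flushes)
}
```

Seven completions are written in three flushes here instead of seven separate writes, which is the kind of saving the throughput figure above refers to.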
- Fixed a problem in `riverpgxv5`'s `Listener` where it wouldn't unset an internal connection if `Close` returned an error, making the listener not reusable. Thanks @mfrister for pointing this one out! PR #246.
- Fixed a memory leak caused by not always cancelling the context used to enable jobs to be cancelled remotely. PR #243.
- `JobListParams.Kinds()` has been added so that jobs can now be listed by kind. PR #212.
- The underlying driver system's been entirely revamped so that River's non-test code is now decoupled from `pgx/v5`. This will allow additional drivers to be implemented, although there are no additional ones for now. PR #212.
- Fixed a memory leak caused by allocating a new random source on every job execution. Thank you @shawnstephens for reporting ❤️ PR #240.
- Fix a problem where `JobListParams.Queues()` didn't filter correctly based on its arguments. PR #212.
- Fix a problem in `DebouncedChan` where it would fire on its "out" channel too often when it was being signaled continuously on its "in" channel. This would have caused work to be fetched more often than intended in busy systems. PR #222.
- Brings in another leadership election fix similar to #217 in which a TTL equal to the elector's run interval plus a configured TTL padding is also used for the initial attempt to gain leadership (#217 brought it in for reelection only). PR #219.
- Tweaked behavior of `JobRetry` so that it does actually update the `ScheduledAt` time of the job in all cases where the job is actually being rescheduled. As before, jobs which are already available with a past `ScheduledAt` will not be touched by this query so that they retain their place in line. PR #211.
- Fixed a leadership re-election issue that was exposed by the fix in #199. Because we were internally using the same TTL for both an internal timer/ticker and the database update to set the new leader expiration time, a leader wasn't guaranteed to successfully re-elect itself even under normal operation. PR #217.
- Added an `ID` setting to the `Client` `Config` type to allow users to override client IDs with their own naming convention. Expose the client ID programmatically (in case it's generated) in a new `Client.ID()` method. PR #206.
- Fix a leadership re-election query bug that would cause past leaders to think they were continuing to win elections. PR #199.
- Added `JobGet` and `JobGetTx` to the `Client` to enable easily fetching a single job row from code for introspection. PR #186.
- Added `JobRetry` and `JobRetryTx` to the `Client` to enable a job to be retried immediately, even if it has already completed, been cancelled, or been discarded. PR #190.
- Validate queue name on job insertion. Allow queue names with hyphen separators in addition to underscore. PR #184.
- Remove a debug statement from periodic job enqueuer that was accidentally left in. PR #176.
- Added `JobCancel` and `JobCancelTx` to the `Client` to enable cancellation of jobs. PR #141 and PR #152.
- Added `ClientFromContext` and `ClientWithContextSafely` helpers to extract the `Client` from the worker's context where it is now available to workers. This simplifies making the River client available within your workers for, e.g., enqueueing additional jobs. PR #145.
- Add `JobList` API for listing jobs. PR #117.
- Added `river validate` command which fails with a non-zero exit code unless all migrations are applied. PR #170.
- For short `JobSnooze` times (smaller than the scheduler's run interval) put the job straight into an `available` state with the specified `scheduled_at` time. This avoids an artificially long delay waiting for the next scheduler run. PR #162.
- Fixed incorrect default value handling for `ScheduledAt` option with `InsertMany` / `InsertManyTx`. PR #149.
- Add missing `t.Helper()` calls in `rivertest` internal functions that caused it to report itself as the site of a test failure. PR #151.
- Fixed problem where job uniqueness wasn't being respected when used in conjunction with periodic jobs. PR #168.
- Calls to `Stop` error if the client hasn't been started yet. PR #138.
- Fix typo in leadership resignation query to ensure faster new leader takeover. PR #134.
- Elector now uses the same `log/slog` instance configured by its parent client. PR #137.
- Notifier now uses the same `log/slog` instance configured by its parent client. PR #140.
- Ensure `ScheduledAt` is respected on `InsertManyTx`. PR #121.
- River CLI `go.sum` entries fixed for 0.0.13 release.
- Added `riverdriver/riverdatabasesql` driver to enable River Go migrations through Go's built-in `database/sql` package. PR #98.
- Errored jobs that have a very short duration before their next retry (<5 seconds) are set to `available` immediately instead of being made `scheduled` and having to wait for the scheduler to make a pass to make them workable. PR #105.
- `riverdriver` becomes its own submodule. It contains types that `riverdriver/riverdatabasesql` and `riverdriver/riverpgxv5` need to reference. PR #98.
- The `river/cmd/river` CLI has been made its own Go module. This is possible now that it uses the exported `river/rivermigrate` API, and will help with project maintainability. PR #107.
- Added `river/rivermigrate` package to enable migrations from Go code as an alternative to using the CLI. PR #67.
- `Stop` and `StopAndCancel` have been changed to respect the provided context argument. When that context is cancelled or times out, those methods will now immediately return with the context's error, even if the Client's shutdown has not yet completed. Apps may need to adjust their graceful shutdown logic to account for this. PR #79.
- `NewClient` no longer errors if it was provided a workers bundle with zero workers. Instead, that check's been moved to `Client.Start`. This allows adding workers to a bundle that'd like to reference a River client by letting `AddWorker` be invoked after a client reference is available from `NewClient`. PR #87.
- Added `Example_scheduledJob`, demonstrating how to schedule a job to be run in the future.
- Added `Stopped` method to `Client` to make it easier to wait for graceful shutdown to complete.
- Fixed a panic in the periodic job enqueuer caused by sometimes trying to reset a `time.Ticker` with a negative or zero duration. Fixed in PR #73.
- `DefaultClientRetryPolicy`: calculate the next attempt based on the current time instead of the time the prior attempt began.
- DATABASE MIGRATION: Database schema v3 was introduced in v0.0.8 and contained an obvious flaw preventing it from running against existing tables. This migration was altered to execute the migration in multiple steps.
- License changed from LGPLv3 to MPL-2.0.
- DATABASE MIGRATION: Database schema v3, alter river_job tags column to set a default of `[]` and add not null constraint.
- Constants renamed so that adjectives like `Default` and `Min` become suffixes instead of prefixes. So for example, `DefaultFetchCooldown` becomes `FetchCooldownDefault`.
- Rename `AttemptError.Num` to `AttemptError.Attempt` to better fit with the name of `JobRow.Attempt`.
- Document `JobState`, `AttemptError`, and all its fields.
- A `NULL` tags value read from a database job is left as `[]string(nil)` on `JobRow.Tags` rather than a zero-element slice of `[]string{}`. `append` and `len` both work on a `nil` slice, so this should be functionally identical.
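The claim above about `nil` slices can be checked with a few lines of plain Go. The `addTag` helper is hypothetical and exists only to give the example something to call.

```go
package main

import "fmt"

// addTag appends a tag to a slice of tags, which may be nil.
func addTag(tags []string, tag string) []string {
	return append(tags, tag)
}

func main() {
	var tags []string // nil, like JobRow.Tags for a NULL tags value

	fmt.Println(tags == nil, len(tags)) // len works on a nil slice

	tags = addTag(tags, "urgent") // append works on a nil slice too
	fmt.Println(len(tags), tags[0])

	for range []string(nil) {
		fmt.Println("never printed") // ranging over a nil slice runs zero times
	}
}
```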
- `JobRow`, `JobState`, and other related types move into `river/rivertype` so they can more easily be shared amongst packages. Most of the River API doesn't change because `JobRow` is embedded on `river.Job`, which doesn't move.
- Remove `replace` directive from the project's `go.mod` so that it's possible to install River CLI with `@latest`.
- Allow River clients to be created with a driver with `nil` database pool for use in testing.
- Update River test helpers API to use River drivers like `riverdriver/riverpgxv5` to make them agnostic to the third party database package in use.
- Document `Config.JobTimeout`'s default value.
- Functionally disable the `Reindexer` queue maintenance service. It'd previously only operated on currently unused indexes anyway, indexes probably do not need to be rebuilt except under fairly rare circumstances, and it needs more work to make sure it's shored up against edge cases like indexes that fail to rebuild before a client restart.
- Fix license detection issues with `riverdriver/riverpgxv5` submodule.
- Ensure that river requires the `riverpgxv5` module with the same version.
- Pin own `riverpgxv5` dependency to v0.0.1 and make it a direct locally-replaced dependency. This should allow projects to import versioned deps of both river and `riverpgxv5`.
- This is the initial prerelease of River.