Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Deployment retry policy #248

Merged
merged 7 commits into from
Dec 4, 2024
Merged

Conversation

adityachoudhari26
Copy link
Contributor

@adityachoudhari26 adityachoudhari26 commented Dec 3, 2024

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced a "Retry Count" field in the Create and Edit Deployment dialogs, allowing users to specify the number of retry attempts for deployments.
    • Enhanced webhook functionality for GitHub integration to support job retries based on deployment status.
    • Added failure handling for job updates, improving response to job failures.
  • Database Changes

    • Added a new retry_count column to the deployment schema to store the retry count.
    • Updated the job trigger type enum to include a "retry" state for better job management.

These enhancements improve user control over deployment processes and strengthen error handling capabilities.

Copy link
Contributor

coderabbitai bot commented Dec 3, 2024

Walkthrough

This pull request introduces a new field, retryCount, across several components and schemas related to deployment management. The CreateDeploymentDialog, EditDeploymentDialog, and EditDeploymentSection components now include a numeric input for retryCount, allowing users to specify the number of retries for deployments. Additionally, the database schema is updated to accommodate this new field, including changes to the deployment table and related enums. The webhook handler is modified to improve job dispatching and retry logic based on the new field.

Changes

File Change Summary
apps/webservice/src/app/[workspaceSlug]/(app)/_components/CreateDeployment.tsx Added retryCount field to CreateDeploymentDialog, initialized to 0, with validation for non-negative integers.
apps/webservice/src/app/[workspaceSlug]/(app)/_components/EditDeploymentDialog.tsx Introduced retryCount field in EditDeploymentDialog, accepting non-negative integers.
apps/webservice/src/app/[workspaceSlug]/(app)/systems/[systemSlug]/deployments/[deploymentSlug]/EditDeploymentSection.tsx Added retryCount field in EditDeploymentSection, configured for non-negative integer input.
apps/webservice/src/app/api/github/webhook/workflow/handler.ts Updated handleWorkflowWebhookEvent to include handling for job failures using onJobFailure.
packages/db/drizzle/0044_sturdy_silver_centurion.sql Added enum value 'retry' to release_job_trigger_type and new column retry_count to deployment table.
packages/db/drizzle/meta/_journal.json Added new entry for migration index 44 related to the schema changes.
packages/db/src/schema/deployment.ts Added retryCount field to deploymentSchema and updated deployment table definition accordingly.
packages/db/src/schema/release.ts Added enum value "retry" to releaseJobTriggerType.
packages/api/src/router/job.ts Integrated onJobFailure in job update logic for handling job failures.
packages/job-dispatch/src/job-failure.ts Introduced functions dispatchJobsForNewerRelease and onJobFailure for handling job failures.
packages/job-dispatch/src/index.ts Added export for job failure handling functions from job-failure.js.

Possibly related PRs

  • fix: Edit deployment variable #164: The EditDeploymentDialog component introduces a new form field for retryCount, which is similar to the changes made in the CreateDeploymentDialog component in the main PR, both adding functionality for managing retry counts in deployment forms.
  • fix: Release dependencies refactor #165: The CreateRelease component modifies the structure of release dependencies, which may relate to how deployments are managed in the main PR, particularly with the addition of fields that could affect deployment logic.
  • fix: Create jobs for latest valid releases on env create #197: The createReleaseChannels function in the route.ts file enhances the handling of release channels during environment creation, which is relevant to the deployment logic introduced in the main PR.
  • fix: Reuse filter endpoints for deployment resource drawer #239: The modifications to the DeploymentResourceDrawer component to enhance filtering logic for job conditions and the retrieval of release data based on deployment environment and resource are directly related to the changes in the main PR, which also focuses on deployment management.
  • fix: Init runbooks ui #241: The JobTable component's query modifications to include a limit parameter may relate to how deployments are displayed and managed in the context of the changes made in the main PR.
  • fix: Resource associations visualization #243: The modifications to edge creation functions in the edges.ts file enhance the visualization of resource associations, which could be relevant to the deployment logic introduced in the main PR.

Suggested reviewers

  • jsbroks

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (3)
apps/webservice/src/app/api/github/webhook/workflow/handler.ts (2)

109-110: Consider awaiting the dispatchJobsForNewerRelease call

In the conditional check, the function returns the Promise from dispatchJobsForNewerRelease without awaiting it. While returning the Promise is acceptable if handled correctly by the caller, explicitly awaiting the call can improve readability and ensure that any asynchronously thrown errors are caught within this function.

Apply this diff to enhance clarity:

if (latestRelease.id !== jobInfo.release.id)
-    return dispatchJobsForNewerRelease(latestRelease.id);
+    await dispatchJobsForNewerRelease(latestRelease.id);
+    return;

191-192: Ensure consistent handling of async functions

The onJobCompletion function is likely asynchronous. To maintain consistency and ensure that all asynchronous operations are completed, consider awaiting this function call.

Apply this diff:

if (status === JobStatus.Completed)
-    return onJobCompletion(job);
+    return await onJobCompletion(job);
apps/webservice/src/app/[workspaceSlug]/(app)/_components/CreateDeployment.tsx (1)

187-206: Consider enhancing the retry count field with validation and help text

While the implementation is functionally correct, consider these improvements for better user experience:

  1. Add a maximum retry limit to prevent excessive retries
  2. Add help text explaining the retry mechanism
  3. Add validation messages for invalid inputs
 <FormField
   control={form.control}
   name="retryCount"
   render={({ field: { value, onChange } }) => (
     <FormItem>
       <FormLabel>Retry Count</FormLabel>
+      <FormDescription>
+        Number of times to retry the deployment if it fails (max: 5)
+      </FormDescription>
       <FormControl>
         <Input
           type="number"
           min={0}
+          max={5}
           step={1}
           value={value}
           onChange={(e) => onChange(e.target.valueAsNumber)}
           className="w-16"
         />
       </FormControl>
       <FormMessage />
     </FormItem>
   )}
 />
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between b1c6006 and ec7fc59.

📒 Files selected for processing (8)
  • apps/webservice/src/app/[workspaceSlug]/(app)/_components/CreateDeployment.tsx (2 hunks)
  • apps/webservice/src/app/[workspaceSlug]/(app)/_components/EditDeploymentDialog.tsx (1 hunks)
  • apps/webservice/src/app/[workspaceSlug]/(app)/systems/[systemSlug]/deployments/[deploymentSlug]/EditDeploymentSection.tsx (1 hunks)
  • apps/webservice/src/app/api/github/webhook/workflow/handler.ts (3 hunks)
  • packages/db/drizzle/0044_sturdy_silver_centurion.sql (1 hunks)
  • packages/db/drizzle/meta/_journal.json (1 hunks)
  • packages/db/src/schema/deployment.ts (3 hunks)
  • packages/db/src/schema/release.ts (1 hunks)
🔇 Additional comments (9)
apps/webservice/src/app/api/github/webhook/workflow/handler.ts (2)

64-82: New function dispatchJobsForNewerRelease is well-implemented

The dispatchJobsForNewerRelease function correctly retrieves pending job triggers for a specific release and dispatches them while ensuring policy compliance and cancelling old triggers. The implementation appears sound and follows best practices.


84-142: maybeRetryJob function effectively handles retry logic

The maybeRetryJob function appropriately checks for job information, compares release IDs, counts existing job triggers, and conditionally creates and dispatches a new trigger based on the retry count. The logic is clear and the function enhances the robustness of job handling.

packages/db/drizzle/0044_sturdy_silver_centurion.sql (1)

1-1: Verify the impact of adding a new value to an existing enum type

Adding a new value 'retry' to the existing enum "release_job_trigger_type" can have implications on the database, especially if there are long-running transactions or if the enum is used in stored procedures. Ensure that this change is carefully deployed to prevent downtime.

To check for dependencies and usages of the enum type, run:

✅ Verification successful

Adding a new enum value is safe in this context

The codebase analysis shows that:

  1. The enum release_job_trigger_type has been safely extended multiple times before (migrations 0030, 0034, 0036)
  2. The enum is used only in the type column of a table defined in the initial migration
  3. PostgreSQL's ALTER TYPE ... ADD VALUE is a safe operation that doesn't require table rewrites or locks that would block reads

This change follows the established pattern for extending the enum and doesn't pose any risks to database operations or require special deployment considerations.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Find all dependencies on the 'release_job_trigger_type' enum.

# Find functions and triggers using the enum
psql -c "\df+ *" | grep -i "release_job_trigger_type"

# Find tables and columns using the enum
psql -c "\d+ *" | grep -i "release_job_trigger_type"

Length of output: 296


Script:

#!/bin/bash
# Search for files containing the enum type name
rg -l "release_job_trigger_type"

# Search for specific usage of the enum type in code
rg "release_job_trigger_type" -A 2 -B 2

# Look for SQL migration files that might have created or modified this enum
fd -e sql | xargs rg "release_job_trigger_type"

Length of output: 42356

packages/db/src/schema/deployment.ts (2)

36-41: Validation for retryCount is appropriate

The addition of the retryCount field with validation to ensure it is a non-negative number is well-implemented. This aligns with the intended functionality and maintains data integrity.


61-61: Database schema update for retryCount

Defining the retryCount column as integer with notNull and a default value of 0 is appropriate. Ensure that the migration script accounts for the potential performance impact on large tables.

apps/webservice/src/app/[workspaceSlug]/(app)/_components/EditDeploymentDialog.tsx (1)

126-143: Consistent handling of retryCount input

Similar to previous components, ensure that NaN values are handled when using e.target.valueAsNumber. This will prevent potential issues with form submission and data integrity.

Refer to the previous comment on handling NaN values in EditDeploymentSection.tsx.

apps/webservice/src/app/[workspaceSlug]/(app)/_components/CreateDeployment.tsx (1)

66-66: LGTM: Default value initialization

The retryCount is correctly initialized to 0 in the form's default values.

packages/db/drizzle/meta/_journal.json (1)

312-318: LGTM: Migration entry follows the established pattern

The new migration entry maintains consistency with the existing structure and sequential ordering.

packages/db/src/schema/release.ts (1)

171-171: LGTM: New trigger type for retry functionality

The addition of the "retry" enum value is well-documented and aligns with the retry functionality being implemented.

Let's verify the enum usage across the codebase:

✅ Verification successful

Let me search for more specific usage patterns to ensure we haven't missed any switch statements or conditional logic handling the trigger types.


Let me try one final search pattern to look for any conditional logic or usage of the trigger types.


Let me check the specific files that might handle the trigger type logic.


Based on the results from all our searches, I can now provide the final response:

LGTM: The "retry" trigger type addition is safe

The verification shows that:

  • The enum addition is properly handled in the database migration (0044_sturdy_silver_centurion.sql)
  • The trigger type is primarily used for type definitions and database schema
  • No switch statements or conditional logic were found that would require updates for the new "retry" value
  • The type is mainly used for database operations and type safety, not for control flow
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for usage of the releaseJobTriggerType enum and "retry" value
echo "Searching for enum usage..."
rg -A 2 "releaseJobTriggerType|release_job_trigger_type.*retry"

# Search for potential switch/case statements that need updating
echo "Checking for switch/case statements..."
ast-grep --pattern 'switch ($expr) { $$$ case "new_release": $$$ }'

Length of output: 1047


Script:

#!/bin/bash
# Search for switch statements with trigger type patterns
ast-grep --pattern 'switch ($expr) {
  $$$
  case $str:
  $$$
}'

# Search for if-else conditions with trigger types
rg -A 3 "if.*type.*=.*['\"]" packages/db/src/

# Search for any other potential usage of trigger types
rg -A 2 "type.*=.*['\"]new_release|type.*=.*['\"]new_resource"

Length of output: 185


Script:

#!/bin/bash
# Search for any usage of trigger types in conditional statements or assignments
rg -A 3 "(type\s*===?\s*['\"]|type:\s*['\"])(new_release|new_resource|retry)"

# Look for any files that might handle release job triggers
rg -l "release.*job.*trigger|trigger.*type"

# Search for enum usage in TypeScript type definitions
ast-grep --pattern 'type $_ = { type: $_ }'

Length of output: 4203


Script:

#!/bin/bash
# Check the job creation and dispatch files for trigger type handling
rg -A 5 "type" packages/job-dispatch/src/job-creation.ts
rg -A 5 "type" packages/job-dispatch/src/job-variables-deployment/job-variables-deployment.ts

# Check the deployment router for trigger type handling
rg -A 5 "type" packages/api/src/router/deployment.ts

# Look for any type definitions or interfaces using the trigger type
ast-grep --pattern 'interface $_ { type: $_ }'

Length of output: 4120

@@ -0,0 +1,2 @@
ALTER TYPE "release_job_trigger_type" ADD VALUE 'retry';--> statement-breakpoint
ALTER TABLE "deployment" ADD COLUMN "retry_count" integer DEFAULT 0 NOT NULL;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Consider the migration strategy for adding a NOT NULL column with a default value

Adding a new NOT NULL column with a default value to a large table can lead to performance issues due to a full table rewrite and locking. To mitigate this, consider adding the column without the NOT NULL constraint and default, backfilling the data, and then altering the column to set the constraint and default.

Apply this refactored migration strategy:

-- Step 1: Add the column as nullable without a default
ALTER TABLE "deployment" ADD COLUMN "retry_count" integer;

-- Step 2: Backfill existing rows with the default value
UPDATE "deployment" SET "retry_count" = 0 WHERE "retry_count" IS NULL;

-- Step 3: Alter the column to set NOT NULL constraint and default value
ALTER TABLE "deployment" ALTER COLUMN "retry_count" SET NOT NULL;
ALTER TABLE "deployment" ALTER COLUMN "retry_count" SET DEFAULT 0;

Comment on lines +120 to +139
<FormField
control={form.control}
name="retryCount"
render={({ field: { value, onChange } }) => (
<FormItem>
<FormLabel>Retry Count</FormLabel>
<FormControl>
<Input
type="number"
value={value}
onChange={(e) => onChange(e.target.valueAsNumber)}
min={0}
step={1}
className="w-16"
/>
</FormControl>
<FormMessage />
</FormItem>
)}
/>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Handle potential NaN values from e.target.valueAsNumber

When using e.target.valueAsNumber in the onChange handler, if the input is empty or invalid (e.g., the user deletes the content), it may result in NaN. This could cause issues with form validation and data submission. Consider adding a check to handle NaN values and default to 0 or an empty string.

Apply this diff to handle NaN values:

<FormControl>
  <Input
    type="number"
    value={value}
-   onChange={(e) => onChange(e.target.valueAsNumber)}
+   onChange={(e) => {
+     const newValue = e.target.valueAsNumber;
+     onChange(isNaN(newValue) ? 0 : newValue);
+   }}
    min={0}
    step={1}
    className="w-16"
  />
</FormControl>
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
<FormField
control={form.control}
name="retryCount"
render={({ field: { value, onChange } }) => (
<FormItem>
<FormLabel>Retry Count</FormLabel>
<FormControl>
<Input
type="number"
value={value}
onChange={(e) => onChange(e.target.valueAsNumber)}
min={0}
step={1}
className="w-16"
/>
</FormControl>
<FormMessage />
</FormItem>
)}
/>
<FormField
control={form.control}
name="retryCount"
render={({ field: { value, onChange } }) => (
<FormItem>
<FormLabel>Retry Count</FormLabel>
<FormControl>
<Input
type="number"
value={value}
onChange={(e) => {
const newValue = e.target.valueAsNumber;
onChange(isNaN(newValue) ? 0 : newValue);
}}
min={0}
step={1}
className="w-16"
/>
</FormControl>
<FormMessage />
</FormItem>
)}
/>

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (2)
packages/job-dispatch/src/job-failure.ts (2)

93-96: Enhance log messages for better traceability.

Including additional identifiers in log messages can aid in debugging. Consider adding the deployment ID to the log message.

Apply this diff to update the log message:

 logger.info(
-  `Retry job for release ${jobInfo.release.id} and resource ${jobInfo.release_job_trigger.resourceId} created and dispatched.`,
+  `Retry job for deployment ${jobInfo.deployment.id}, release ${jobInfo.release.id}, resource ${jobInfo.release_job_trigger.resourceId} created and dispatched.`,
 );

39-97: Add error handling to ensure exceptions are logged.

Currently, the onJobFailure function does not have explicit error handling. Wrapping the function's logic in a try-catch block will ensure that any exceptions are caught and appropriately logged.

Implement error handling as follows:

export const onJobFailure = async (job: schema.Job) => {
  try {
    // Existing function logic...
  } catch (error) {
    logger.error(`Error handling job failure for job ${job.id}:`, error);
  }
};

This ensures that unexpected errors do not fail silently and are captured in your logs.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between ec7fc59 and fde49d3.

📒 Files selected for processing (5)
  • apps/webservice/src/app/api/github/webhook/workflow/handler.ts (2 hunks)
  • apps/webservice/src/app/api/v1/jobs/[jobId]/route.ts (2 hunks)
  • packages/api/src/router/job.ts (2 hunks)
  • packages/job-dispatch/src/index.ts (1 hunks)
  • packages/job-dispatch/src/job-failure.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • apps/webservice/src/app/api/github/webhook/workflow/handler.ts
🔇 Additional comments (4)
packages/job-dispatch/src/job-failure.ts (1)

19-37: Function dispatchJobsForNewerRelease is implemented correctly.

The function effectively retrieves pending release job triggers for the specified release ID and dispatches them after applying necessary filters.

apps/webservice/src/app/api/v1/jobs/[jobId]/route.ts (1)

165-166: Job failure handling integrated properly.

The onJobFailure function is correctly invoked when the job status is JobStatus.Failure, ensuring that failure handling logic is executed as expected.

packages/api/src/router/job.ts (2)

57-57: onJobFailure function imported successfully.

The import statement for onJobFailure ensures that the function is available for use within this module.


552-558: Invoke onJobFailure when job status updates to failure.

The onJobFailure function is appropriately called when a job's status changes to JobStatus.Failure. This integration enhances the job update handling by managing failure scenarios.

@@ -1,6 +1,7 @@
export * from "./config.js";
export * from "./release-job-trigger.js";
export * from "./job-creation.js";
export * from "./job-failure.js";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Export statement is correct but needs .ts extension

The job-failure.ts module exists and exports the expected onJobFailure function, but the import statement in index.ts incorrectly references it with a .js extension. It should be:

export * from "./job-failure.ts";
🔗 Analysis chain

LGTM! Verify the job-failure module exports.

The new export statement is correctly placed and aligns with the PR's objective of implementing deployment retry policy.

Let's verify the existence and exports of the job-failure module:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the job-failure.js module exists and exports the expected functions

# Check if the file exists and contains the expected exports
ast-grep --pattern 'export async function $func($_) {
  $$$
}' packages/job-dispatch/src/job-failure.js

# Verify there are no syntax errors in the module
cat packages/job-dispatch/src/job-failure.js

Length of output: 313


Script:

#!/bin/bash
# Let's check if the file exists with a different extension or in a different location
fd job-failure

# Let's also check the content of the index.ts file to see all exports
cat packages/job-dispatch/src/index.ts

Length of output: 1099


Script:

#!/bin/bash
# Let's check the content of the actual TypeScript file
cat packages/job-dispatch/src/job-failure.ts

# Also check if there are any references to this module in other files
rg "job-failure" --type ts -g '!*.d.ts' -l

Length of output: 3157

Comment on lines +39 to +97
export const onJobFailure = async (job: schema.Job) => {
const jobInfo = await db
.select()
.from(schema.releaseJobTrigger)
.innerJoin(
schema.release,
eq(schema.releaseJobTrigger.releaseId, schema.release.id),
)
.innerJoin(
schema.deployment,
eq(schema.release.deploymentId, schema.deployment.id),
)
.where(eq(schema.releaseJobTrigger.jobId, job.id))
.then(takeFirstOrNull);

if (jobInfo == null) return;

const latestRelease = await db
.select()
.from(schema.release)
.where(eq(schema.release.deploymentId, jobInfo.deployment.id))
.orderBy(desc(schema.release.createdAt))
.limit(1)
.then(takeFirst);

if (latestRelease.id !== jobInfo.release.id)
return dispatchJobsForNewerRelease(latestRelease.id);

const releaseJobTriggers = await db
.select({ count: count() })
.from(schema.releaseJobTrigger)
.where(eq(schema.releaseJobTrigger.releaseId, jobInfo.release.id))
.then(takeFirst);

const { count: releaseJobTriggerCount } = releaseJobTriggers;

if (releaseJobTriggerCount >= jobInfo.deployment.retryCount) return;

const createTrigger = createReleaseJobTriggers(db, "retry")
.releases([jobInfo.release.id])
.environments([jobInfo.release_job_trigger.environmentId]);

const trigger =
jobInfo.release_job_trigger.causedById != null
? await createTrigger
.causedById(jobInfo.release_job_trigger.causedById)
.insert()
: await createTrigger.insert();

await dispatchReleaseJobTriggers(db)
.releaseTriggers(trigger)
.then(cancelOldReleaseJobTriggersOnJobDispatch)
.dispatch()
.then(() =>
logger.info(
`Retry job for release ${jobInfo.release.id} and resource ${jobInfo.release_job_trigger.resourceId} created and dispatched.`,
),
);
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Refine the retry count logic to consider only failed job triggers.

In the onJobFailure function, when checking whether to retry a failed job, the code counts all job triggers associated with the release:

const releaseJobTriggers = await db
  .select({ count: count() })
  .from(schema.releaseJobTrigger)
  .where(eq(schema.releaseJobTrigger.releaseId, jobInfo.release.id))
  .then(takeFirst);

const { count: releaseJobTriggerCount } = releaseJobTriggers;

if (releaseJobTriggerCount >= jobInfo.deployment.retryCount) return;

This count may include job triggers that are not failures (e.g., successful or pending jobs), which could lead to premature cessation of retries. To ensure accurate retry logic, consider counting only the failed job triggers for this release.

Apply this diff to adjust the count logic:

 const releaseJobTriggers = await db
-  .select({ count: count() })
+  .select({ count: count().as('failedCount') })
   .from(schema.releaseJobTrigger)
+  .innerJoin(schema.job, eq(schema.releaseJobTrigger.jobId, schema.job.id))
   .where(
     and(
       eq(schema.releaseJobTrigger.releaseId, jobInfo.release.id),
+      eq(schema.job.status, JobStatus.Failure),
     ),
   )
   .then(takeFirst);

 const { count: releaseJobTriggerCount } = releaseJobTriggers;

This change ensures that only failed job triggers are counted towards the retry limit.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (1)
packages/job-dispatch/src/job-update.ts (1)

64-70: Consider returning the updated job from updateJob function

Returning the updatedJob from the updateJob function may be beneficial for further processing or chaining in other parts of the codebase.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between fde49d3 and c220e0d.

📒 Files selected for processing (4)
  • apps/webservice/src/app/api/github/webhook/workflow/handler.ts (2 hunks)
  • packages/db/src/schema/job.ts (4 hunks)
  • packages/job-dispatch/src/index.ts (1 hunks)
  • packages/job-dispatch/src/job-update.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • packages/job-dispatch/src/index.ts
  • apps/webservice/src/app/api/github/webhook/workflow/handler.ts
🔇 Additional comments (6)
packages/job-dispatch/src/job-update.ts (2)

10-50: Efficient metadata update in updateJobMetadata function

The updateJobMetadata function effectively handles merging and updating job metadata with proper conflict resolution.


73-81: Correct handling of job status transitions

The implementation correctly handles job status changes, triggering onJobFailure and onJobCompletion when appropriate.

packages/db/src/schema/job.ts (4)

8-8: Importing z from zod for type inference

Good addition to support type inference with z.infer, enhancing type safety.


101-101: Adding metadata relation to jobRelations

Appropriate addition to establish the relationship between job and jobMetadata, facilitating metadata queries.


118-120: Defining jobMetadataRelations to link metadata to jobs

Establishes the inverse relationship, enhancing the schema's consistency and navigability.


131-131: Exporting UpdateJob type using z.infer

Ensures consistent typing for UpdateJob, improving maintainability and reducing potential type errors.

Comment on lines +19 to +23
...JSON.parse(
existingMetadata.find(
(m) => m.key === String(ReservedMetadataKey.Links),
)?.value ?? "{}",
),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Add error handling for JSON.parse to prevent runtime exceptions

The JSON.parse operation may throw an error if value contains invalid JSON. Consider adding error handling to prevent potential runtime exceptions.

Apply this diff to add error handling:

-const updatedLinks = JSON.stringify({
-  ...JSON.parse(
-    existingMetadata.find(
-      (m) => m.key === String(ReservedMetadataKey.Links),
-    )?.value ?? "{}",
-  ),
-  ...links,
-});
+let existingLinks = {};
+try {
+  existingLinks = JSON.parse(
+    existingMetadata.find(
+      (m) => m.key === String(ReservedMetadataKey.Links),
+    )?.value ?? "{}",
+  );
+} catch (e) {
+  console.error('Invalid JSON in existing links:', e);
+}
+const updatedLinks = JSON.stringify({
+  ...existingLinks,
+  ...links,
+});

Committable suggestion skipped: line range outside the PR's diff.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (2)
apps/webservice/src/app/api/github/webhook/workflow/handler.ts (1)

69-77: Add URL validation for GitHub action run link

The GitHub action run URL construction assumes repository owner and name are always present and valid. Consider adding validation or using URL constructor for safer URL building.

 await updateJob(
   job.id,
   { status, externalId },
   {
     [String(ReservedMetadataKey.Links)]: {
-      GitHub: `https://github.com/${repository.owner.login}/${repository.name}/actions/runs/${id}`,
+      GitHub: new URL(
+        `/actions/runs/${id}`,
+        `https://github.com/${repository.owner.login}/${repository.name}`
+      ).toString(),
     },
   },
 );
apps/webservice/src/app/api/v1/jobs/[jobId]/route.ts (1)

63-91: Consider using type-safe joins

The multiple join operations could benefit from type safety using the schema's type system. Consider using the $inferSelect type to ensure type safety across joins.

type JobJoinResult = {
  job: typeof schema.job.$inferSelect;
  runbook: typeof schema.runbook.$inferSelect | null;
  environment: typeof schema.environment.$inferSelect | null;
  // ... other joined tables
};
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between c220e0d and 66067a9.

📒 Files selected for processing (4)
  • apps/webservice/src/app/api/github/webhook/workflow/handler.ts (2 hunks)
  • apps/webservice/src/app/api/v1/jobs/[jobId]/route.ts (6 hunks)
  • packages/api/src/router/job.ts (14 hunks)
  • packages/job-dispatch/src/job-update.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/api/src/router/job.ts
🔇 Additional comments (1)
packages/job-dispatch/src/job-update.ts (1)

19-23: Add error handling for JSON.parse operation

The JSON.parse operation may throw if the metadata value contains invalid JSON.

Comment on lines +52 to +85
export const updateJob = async (
jobId: string,
data: schema.UpdateJob,
metadata?: Record<string, any>,
) => {
const jobBeforeUpdate = await db.query.job.findFirst({
where: eq(schema.job.id, jobId),
with: { metadata: true },
});

if (jobBeforeUpdate == null) throw new Error("Job not found");

const updatedJob = await db
.update(schema.job)
.set(data)
.where(eq(schema.job.id, jobId))
.returning()
.then(takeFirst);

if (metadata != null)
await updateJobMetadata(jobId, jobBeforeUpdate.metadata, metadata);

const isJobFailure =
data.status === JobStatus.Failure &&
jobBeforeUpdate.status !== JobStatus.Failure;
if (isJobFailure) await onJobFailure(updatedJob);

const isJobCompletion =
data.status === JobStatus.Completed &&
jobBeforeUpdate.status !== JobStatus.Completed;
if (isJobCompletion) await onJobCompletion(updatedJob);

return updatedJob;
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider wrapping updates in a transaction

The function performs multiple database operations (job update and metadata update) that should be atomic. Consider wrapping these operations in a transaction to ensure data consistency.

 export const updateJob = async (
   jobId: string,
   data: schema.UpdateJob,
   metadata?: Record<string, any>,
 ) => {
+  return await db.transaction(async (tx) => {
     const jobBeforeUpdate = await db.query.job.findFirst({
       where: eq(schema.job.id, jobId),
       with: { metadata: true },
     });

     if (jobBeforeUpdate == null) throw new Error("Job not found");

     const updatedJob = await db
       .update(schema.job)
       .set(data)
       .where(eq(schema.job.id, jobId))
       .returning()
       .then(takeFirst);

     if (metadata != null)
       await updateJobMetadata(jobId, jobBeforeUpdate.metadata, metadata);

     const isJobFailure =
       data.status === JobStatus.Failure &&
       jobBeforeUpdate.status !== JobStatus.Failure;
     if (isJobFailure) await onJobFailure(updatedJob);

     const isJobCompletion =
       data.status === JobStatus.Completed &&
       jobBeforeUpdate.status !== JobStatus.Completed;
     if (isJobCompletion) await onJobCompletion(updatedJob);

     return updatedJob;
+  });
 };

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +157 to +159
const job = await updateJob(params.jobId, body);

return NextResponse.json(je);
return NextResponse.json(job);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Add error handling for updateJob operation

The PATCH handler should include proper error handling for the updateJob operation, including handling of the "Job not found" error.

-  const job = await updateJob(params.jobId, body);
-
-  return NextResponse.json(job);
+  try {
+    const job = await updateJob(params.jobId, body);
+    return NextResponse.json(job);
+  } catch (error) {
+    if (error instanceof Error && error.message === "Job not found") {
+      return NextResponse.json(
+        { error: "Job not found" },
+        { status: 404 }
+      );
+    }
+    throw error;
+  }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const job = await updateJob(params.jobId, body);
return NextResponse.json(je);
return NextResponse.json(job);
try {
const job = await updateJob(params.jobId, body);
return NextResponse.json(job);
} catch (error) {
if (error instanceof Error && error.message === "Job not found") {
return NextResponse.json(
{ error: "Job not found" },
{ status: 404 }
);
}
throw error;
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant