-
Notifications
You must be signed in to change notification settings - Fork 99
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix #189: Update the Beam jobs wiki page with instructions for how we handle Beam jobs in the Oppia codebase. #191
Changes from 1 commit
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -28,6 +28,7 @@ | |
* [Common Beam errors](#common-beam-errors) | ||
* [`'_UnwindowedValues' object is not subscriptable` error](#_unwindowedvalues-object-is-not-subscriptable-error) | ||
* [`_namedptransform is not iterable` error](#_namedptransform-is-not-iterable-error) | ||
* [Guidelines for writing beam jobs](#guidelines-for-writing-beam-jobs) | ||
* [Case studies](#case-studies) | ||
* [Case Study: `SchemaMigrationJob`](#case-study-schemamigrationjob) | ||
|
||
|
@@ -691,6 +692,32 @@ some_values = ( | |
| 'Get values' >> beam.Values() | ||
) | ||
``` | ||
## Guidelines for writing beam jobs | ||
This section provides some general guidelines for writing Beam jobs, which would be particularly helpful for new contributors. | ||
|
||
### Planning a job | ||
* If your beam job includes updating the storage models, make sure to write an audit job. An audit job is similar to your beam job but doesn't make any changes to the datastore. During testing, the audit job is run before the actual beam job to prevent any unwanted changes to the datastore incase of an error. You will be able to find examples in the codebase for the same. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Here and elsewhere: capitalize "Beam" There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done |
||
* There should also be a job to verify the changes done by your job. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you explain how to do this (maybe a worked example or a reference)? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The bullet point here still reads as though it's standalone. Maybe put the previous point below it so that it's clear that the previous point is an example of this point? Otherwise, it looks like the previous point is an example of the point before it. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @seanlip Do you mean to say I should interchange the points "If your beam job..." and "Consider the example..."? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ah, I just mean that the actual example should be after the point that it is supposed to be an example for. If it is an example for line 703 then the line with the example should come after line 703. Does that help clarify? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes it does. Thanks for the clarification @seanlip. |
||
|
||
### Executing jobs | ||
* Refer to code from similar jobs to avoid mistakes. The "Troubleshooting" section in the wiki lists out the common errors encountered while executing jobs. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Drop "out the" There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done |
||
* Make sure to write tests for all jobs. These tests should check the behaviour of the job for different cases of inputs. The common cases for all jobs are listed below: | ||
* Empty input. | ||
* Input which triggers the job to perform its intended action. | ||
* Incorrect input which causes the job to fail. | ||
* The tests should ensure that the job runs appropirately for all such cases. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. "appropriately" is misspelled There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done |
||
|
||
### PR guidelines | ||
* A PR for a beam job should always have "Proof of work" in the description which should include the successful local run of the beam job on the release coordinator page and a screenshot of the storage model which is being changed. The storage models can be checked [locally](https://github.com/oppia/oppia/wiki/Debugging-datastore-locally) via dsadmin. | ||
* A testing request for the job must be submitted once it is approved by all reviewers. Make sure to be clear in your instructions (in the doc) about what exactly needs to be done by the tester. Also, the doc link should be included in the PR description. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you explain or point to info on how to actually submit the request? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done |
||
|
||
### Job testing workflow | ||
* As soon as a PR is approved by all reviewers, a testing doc should be created. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This seems to follow on from the previous section. Might it be better to combine them, and have the following points as sub-bullets of the "testing request" main bullet point? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Makes sense. Done. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Shouldn't the testing doc be created along with the initial review of the PR and be reviewed immediately as well? We can save the actual job run until the PR has been approved by others, but I don't see a reason to delay the creation of the documentation (and in fact I think it's needed for being able to review the PR properly in the first place). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes you are right. Done |
||
* The sample job testing [template](https://docs.google.com/document/d/1Xg0MSIpYUEax8dqH39CKhgqt30f-kr8wxO2F4aSxwz4/edit) should be used as a starting point for the testing doc. Make sure to make a copy before editing the doc. | ||
* In case of multiple jobs to be tested, they should be written in the order of testing. Generally, an audit job is run before the migration job to verify that the job runs as intended. | ||
* The instructions should be concise to prevent any confusion for the tester. Any pre and post checks to be done should also be specified clearly. | ||
* After the doc is complete, request a review from the main code reviewer of the related PR. The reviewer should answer the relevant questions at the beginnig of the doc after completing their review. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. beginnig --> beginning Maybe change the last sentence to: "The reviewer should fill in the approval section at the top of the doc." There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done |
||
* The assigned server admin will then test the PR according to the instructions in the doc and update the PR author with the results. In case of errors,the server admin would provide the relevant logs for debugging. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe include the trigger condition at the start: Once approvals have been received, the ... Add a space after "In case of errors," There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done |
||
|
||
## Case Studies | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
incase --> in case
Can you point to the examples in the codebase directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done