From fe4d31b58f71662f23bc6f57bef117e378f67344 Mon Sep 17 00:00:00 2001
From: Ashwath Kannan
Date: Thu, 3 Oct 2024 18:49:29 +0530
Subject: [PATCH 1/9] Draft server error

---
 ...the-reproduction-steps-for-server-error.md | 974 ++++++++++++++++++
 1 file changed, 974 insertions(+)
 create mode 100644 Tutorial:-Learn-How-to-figure-out-the-reproduction-steps-for-server-error.md

diff --git a/Tutorial:-Learn-How-to-figure-out-the-reproduction-steps-for-server-error.md b/Tutorial:-Learn-How-to-figure-out-the-reproduction-steps-for-server-error.md
new file mode 100644
index 00000000..1e32cd42
--- /dev/null
+++ b/Tutorial:-Learn-How-to-figure-out-the-reproduction-steps-for-server-error.md
@@ -0,0 +1,974 @@
## Introduction

This tutorial will guide you through debugging a server error that is challenging to reproduce locally. Specifically, we will investigate and fix a `TypeError` related to certificate generation for contributors.


## Skills Covered

- Codebase Navigation
- Identifying and Analyzing Error Logs
- Debugging Techniques
- Reproducing Server Errors Locally


## Scenario

One of the server admins has reported the following error logs. Your task is to investigate the issue and determine how and why it is occurring.

> **Note:** The primary goal of this tutorial is not to find a solution, but to guide you through the process of investigating and understanding the workflow of debugging server errors. In this tutorial, you will follow the steps a developer might take to investigate this server error.

```plaintext
TypeError: expected string or bytes-like object

Exception raised: expected string or bytes-like object
Traceback (most recent call last):
  File "/layers/google.python.pip/pip/lib/python3.8/site-packages/webapp2.py", line 604, in dispatch
    return method(*args, **kwargs)
  File "/workspace/core/controllers/acl_decorators.py", line 4788, in test_can_fetch_all_contributor_dashboard_stats
    return handler(self, username, **kwargs)
  File "/workspace/core/controllers/contributor_dashboard.py", line 1062, in get
    response = suggestion_services.generate_contributor_certificate_data(
  File "/workspace/core/domain/suggestion_services.py", line 3870, in generate_contributor_certificate_data
    data = _generate_translation_contributor_certificate_data(
  File "/workspace/core/domain/suggestion_services.py", line 3945, in _generate_translation_contributor_certificate_data
    plain_text = _get_plain_text_from_html_content_string(
  File "/workspace/core/domain/suggestion_services.py", line 1408, in _get_plain_text_from_html_content_string
    html_content_string_with_rte_tags_replaced = re.sub(
  File "/layers/google.python.runtime/python/lib/python3.8/re.py", line 210, in sub
    return _compile(pattern, flags).sub(repl, string, count)
TypeError: expected string or bytes-like object
```

## Prerequisites

Before you begin, make sure you have:

- Set up your development environment. (If you haven't, follow the [Oppia setup instructions](https://github.com/oppia/oppia/wiki/Installing-Oppia).)
- Familiarized yourself with the [Debugging wiki page](https://github.com/oppia/oppia/wiki/Debugging).
- A basic understanding of the following key concepts:
  - **Exploration**: Familiarize yourself with explorations by reviewing the [Oppia User Documentation on Explorations](https://oppia.github.io/#/KeyConceptsInOppia).
  - **Topic**: Learn about topics by visiting the [relevant wiki page](https://github.com/oppia/oppia/wiki/How-to-access-Oppia-webpages#overview-of-entities).

These concepts are fundamental to understanding how content is structured in Oppia, and they will help you follow the steps in this tutorial more easily.

## Procedure

> The following steps illustrate how a developer might tackle this issue. Try following this tutorial step-by-step on your local machine! This will give you a better sense of how to tackle other similar issues in the codebase. If you get stuck with a step in this tutorial, raise an issue in [GitHub Discussions](https://github.com/oppia/oppia/discussions/categories/tutorial-feedback) to get help.

> **Important:** When you see a “practice question box”, stop and try to figure out the answer on your own before reading ahead. You will learn more if you try to figure out your own answer to the question first!

## Stage 1: Understand the Architecture of Beam Jobs at Oppia

Beam jobs at Oppia are typically structured in a modular fashion, where the main components include:

- **Pipeline Definition**: This is where you define the structure of your Beam pipeline.
- **Transforms**: These are the processing steps that operate on the data.
- **IO**: Input and output operations to read from and write to data sources.
- **Testing**: Unit tests to ensure the job is correct.

Refer to the [Beam Jobs at Oppia documentation](https://github.com/oppia/oppia/wiki/Apache-Beam-Jobs#writing-apache-beam-jobs) for a detailed overview.

## Stage 2: Understand the Models

> [!IMPORTANT]
> **Practice 1: Locate the definition of `TopicModel` and `TopicSummaryModel`.**
> You can utilize the quick search functionality of your code editor to facilitate this process.
> **Hint**: Take a look at this wiki page on [Tips for common IDEs](https://github.com/oppia/oppia/wiki/Tips-for-common-IDEs#visual-studio-code).

Before diving into the implementation, it's crucial to understand the models we will be working with.

<details>
+Topic Model (oppia/core/storage/topic/gae_models.py) + +```python +class TopicModel(base_models.VersionedModel): + """Model for storing Topics. + This class should only be imported by the topic services file and the topic model test file. + """ +``` +
The `TopicModel` is used for storing detailed information about topics. It inherits from `VersionedModel`, meaning it maintains a history of the changes made to each instance. If you are curious about how versions are maintained, take a look at the `VersionedModel` definition in `oppia/core/storage/base_model/gae_models.py`.

<details>
+Topic Summary Model (oppia/core/storage/topic/gae_models.py) + +```python +class TopicSummaryModel(base_models.BaseModel): + """Summary model for an Oppia Topic. + This should be used whenever the content blob of the topic is not needed (e.g. search results, etc). + The key of each instance is the topic id. + """ + +``` +
+ +The TopicSummaryModel provides a summarized view of a topic, which is useful for scenarios where the full details of a topic’s contents are not required, such as search results. Each TopicSummaryModel instance is identified by the topic id. + +## Stage 3: Draft the Directed Acyclic Graph (DAG) + +Before we write any code, let's visualize the workflow of our Beam job as a Directed Acyclic Graph (DAG). This will help us understand the sequence of operations and the flow of data through the pipeline. + +#### What is a DAG? +Like all graphs, a directed acyclic graph (DAG) consists of nodes connected by edges. In this case, the nodes are steps in the job, and the edges indicate the order in which to complete the steps. The edges are thus directional (hence "directed"), and the graph isn't allowed to have any cycles (hence "acyclic"). In other words, it should be impossible to start at one node and follow the edges back to the same node, as this would create an infinite loop in our job. + +For more detailed information about DAGs, you can refer to the [DAG Wikipedia](https://en.wikipedia.org/wiki/Directed_acyclic_graph) page. + +Visualizing our Beam job as a DAG helps in planning the structure and flow of our data processing pipeline. It provides a clear picture of how data moves from one step to another, ensuring that all necessary operations are performed in the correct order. + +### Step 3.1: Define the Job's Objective + +> [!IMPORTANT] +> Practice 2: Take a notebook and try drafting a rough workflow of what our job would do, using boxes for the steps and arrows to connect different steps. +> Hint: +> - Read Everything First: Start by reading all the necessary data at the beginning of the job. This ensures that you have all the required information before performing any operations. +> - Process Data in Steps: Break down the job's functionality into simpler steps, such as filtering, transforming, and aggregating the data. Each step should be a separate node in your DAG. +> - Write Everything Last: Ensure that all writing operations, such as saving results or updating models, are performed at the end of the job. This helps in maintaining data consistency and avoids incomplete writes. + +The job's objective is to validate that each topic model in the datastore has a corresponding topic summary model. The workflow can be broken down into the following steps: + +- **Read Topic Models**: Read all the topic models from the datastore. +- **Validate Topic Models**: For each topic model, check if there is a corresponding topic summary model. +- **Output Results**: Write the validation results to an output sink (e.g., a text file). + +### Step 3.2: Draw the Directed Acyclic Graph (DAG) + +Here's a simple representation of the DAG for our Beam job: + +![DAG for Beam Job](images/TutorialDebuggingServerError/DAG.jpg) + +In this DAG: + +- **ReadTopicModels**: Reads topic models from the datastore and extracts their IDs. +- **ReadTopicSummaries**: Reads topic summary models from the datastore and extracts their IDs. +- **ValidateTopicModels**: Processes each topic model ID to check for a corresponding topic summary model ID, identifying any missing topic summary models. +- **WriteResults**: Writes the validation results to an output sink, including: + - The count of total topic models. + - The count of missing topic summary models. + - Necessary details about each missing topic summary model. + +#### Data Flow: + +- The **ReadTopicModels** step outputs the IDs of the topic models. 
- The **ReadTopicSummaries** step outputs the IDs of the topic summary models.
- The **ValidateTopicModels** step identifies missing topic summary models.
- The **WriteResults** step generates reports based on the validation.

#### Batch I/O vs Incremental I/O

An alternative workflow you might consider is to iterate over the models one at a time, validating each model and appending the result to a log as you go.
Before diving into different approaches, it's important to understand whether incremental reads and writes are feasible within Apache Beam.

#### Beam's Batch Processing Model:

Apache Beam generally operates on a batch processing model, where data is read all at once, processed, and then written out. The idea of interleaving reads and writes within each loop iteration (reading, validating, and then writing one model at a time) isn't supported in this model. While Beam does offer windowing for processing data in chunks, this approach adds complexity and isn't currently used in Oppia's workflows.

Since incremental I/O isn’t feasible in Beam’s batch model, the batch approach is both necessary and more effective. Here’s why:

- **Efficiency**: Batch I/O allows for optimized, parallel processing, making it faster than processing one item at a time, which would otherwise slow down due to constant I/O operations.
- **Data Consistency**: Writing all results at the end ensures consistency, avoiding issues like incomplete logs or partial data that could occur with incremental writes, even if they were possible.
- **Simplicity**: Separating reading, processing, and writing into distinct stages makes the workflow clearer and easier to debug. Mixing these steps would only complicate the process, making troubleshooting more difficult.

## Stage 4: Writing the Beam Job

With the DAG in mind, we can now proceed to implement the Beam job. Let’s create a new file for our job in the `oppia/core/jobs/batch_jobs` directory.

> [!IMPORTANT]
> **Practice 3: Decide on suitable names for the module and job.**
> Hint: Follow the conventions mentioned in the [Apache Beam Jobs Wiki](https://github.com/oppia/oppia/wiki/Apache-Beam-Jobs#testing-apache-beam-jobs).

Per the Oppia documentation for Beam Jobs:

- The name of the file follows the format `<entity>_<operation>_jobs.py`. In this case, we can use something like `topic_validation_jobs.py`.
- The name of the job follows the convention `<Verb><Noun>Job`. In this case, we can name the job `ValidateTopicModelsJob`.

Let’s first understand the part where we need to fetch the `TopicModels`:

<details>
+Code for Fetching Topic Models + +```python +topic_models_pcoll = ( + self.pipeline + | 'Get all TopicModels' >> ( + ndb_io.GetModels(topic_models.TopicModel.get_all())) + | 'Extract topic model ids' >> beam.Map( + lambda model: model.id) +) +``` +
In the above code, `self.pipeline` is the starting point of our Beam pipeline. `ndb_io.GetModels(topic_models.TopicModel.get_all())` fetches all `TopicModel` entities from the datastore, and `beam.Map(lambda model: model.id)` extracts the `id` from each entity. We only need the IDs to check for corresponding `TopicSummaryModel` entities.

> [!IMPORTANT]
> Practice 4: Based on the previous step where we fetched all the topic models, can you try writing the code for fetching topic summary models?
> Hint: Refer to the code snippet for fetching `TopicModel` entities and apply a similar approach.
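If you have not seen Beam's pipe (`|`) syntax before, here is a tiny self-contained sketch (illustrative only, not Oppia code) of the same shape: a source `PCollection` piped through labelled transforms. It runs with Beam's local `DirectRunner`:

```python
import apache_beam as beam

with beam.Pipeline() as pipeline:
    ids = (
        pipeline
        # Stand-in for ndb_io.GetModels(...): create two fake "models".
        | 'Create fake models' >> beam.Create(
            [{'id': 'topic_1'}, {'id': 'topic_2'}])
        # Keep only the IDs, just like the real job does.
        | 'Extract ids' >> beam.Map(lambda model: model['id'])
    )
    ids | 'Print ids' >> beam.Map(print)  # Prints: topic_1, topic_2
```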
Code for Fetching Topic Summary Models + +```python +topic_summary_models_pcoll = ( + self.pipeline + | 'Get all TopicSummaryModels' >> ( + ndb_io.GetModels(topic_models.TopicSummaryModel.get_all())) + | 'Extract topic summary model ids' >> beam.Map( + lambda model: model.id) +) +``` +
Similar to the previous step, this code fetches all `TopicSummaryModel` entities and extracts their IDs. This step ensures we have a list of all summary models to compare against our topic models.

> [!IMPORTANT]
> Practice 5: Now that we have the IDs of both `TopicModels` and `TopicSummaryModels`, we need to compare them. If a `TopicModel` ID is not present among the `TopicSummaryModel` IDs, we need to store it so that we can report it in the next stage.
> Hint: Use the `beam.Filter` transform to filter out the IDs that are not present in the list of `TopicSummaryModel` IDs.
> Note that `beam.Filter` works with a lambda function that returns `True` for items you want to keep and `False` for those you want to exclude.
> Because a `PCollection` cannot be queried like an in-memory list, you will need to pass the `TopicSummaryModel` IDs into the filter as a side input (for example, via `beam.pvalue.AsList`); see the small illustration below.
> You might need to use the `beam.Map` transform to extract the IDs before using `beam.Filter`.
> For more details, you can also refer to the [Apache Beam Jobs Wiki on PTransforms](https://github.com/oppia/oppia/wiki/Apache-Beam-Jobs#ptransforms).
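Here is a tiny self-contained sketch (illustrative only, not Oppia code) of `beam.Filter` with a `beam.pvalue.AsList` side input, which is the Beam mechanism the hint above relies on:

```python
import apache_beam as beam

with beam.Pipeline() as pipeline:
    main_ids = pipeline | 'Main ids' >> beam.Create(['a', 'b', 'c'])
    known_ids = pipeline | 'Known ids' >> beam.Create(['a', 'c'])

    # The side input is materialized as a list and passed to the lambda
    # as an extra argument alongside each element of main_ids.
    missing = main_ids | 'Keep unknown ids' >> beam.Filter(
        lambda value, known: value not in known,
        known=beam.pvalue.AsList(known_ids))

    missing | 'Print missing' >> beam.Map(print)  # Prints: b
```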
Code for Identifying Missing Topic Summary Models

```python
missing_topic_summary_models_pcoll = (
    topic_models_pcoll
    | 'Identify missing TopicSummaryModels' >> beam.Filter(
        # A PCollection cannot be queried with `in` directly, so the
        # summary-model IDs are passed in as a side input.
        lambda topic_id, summary_ids: topic_id not in summary_ids,
        summary_ids=beam.pvalue.AsList(topic_summary_models_pcoll))
)
```
The `beam.Filter` step keeps only those `TopicModel` IDs that do not have a corresponding `TopicSummaryModel` ID. Note that the summary IDs are supplied as a side input via `beam.pvalue.AsList`, since a `PCollection` cannot be queried with `in` directly.
This step helps us identify all topic models that lack corresponding summary models, which is the main goal of our validation job.


### Planning Reports and Logs

Logging and reporting are crucial for understanding the outcome of the Beam job and for debugging. In our validation job, we need to consider what specific information is essential to log and report, along with the purpose of each.

#### What to Log and Report:

- **Total Number of Topic Models**:
  **Purpose**: This value helps verify that the Beam job is processing the expected number of topic models. If the number is lower or higher than anticipated, it can indicate an issue with reading the data from the datastore, such as missing or duplicated topic models.
  **Details Logged**: The total count of topic models successfully read from the datastore.

- **Total Number of Missing Topic Summary Models**:
  **Purpose**: This provides a quick overview of how many topic models are missing corresponding summary models, indicating potential data inconsistencies. Although this information can be derived from the detailed report in the next item, reporting it separately serves as a quick sanity check for admins or developers. If the total number is non-zero, it signals an issue that needs further investigation.
  **Details Logged**: The total count of topic models that do not have corresponding topic summary models.

- **Report Topic IDs with Missing Topic Summary Models**:
  **Purpose**: While the total number (logged in the previous item) provides a high-level summary, this log provides the specific IDs of the topic models that are missing their summary models. This aids in identifying which exact models need attention and allows for detailed investigation and debugging.
  **Details Logged**: A list of the topic model IDs that do not have corresponding topic summary models.

#### Cross-Consistency Checking

Reporting both the total number of missing summary models and the specific IDs serves a dual purpose:

- **Summary for Monitoring**: The total count is useful as a quick sanity check. If it’s non-zero, it signals that there's an issue needing attention.
- **Detailed Debugging**: The list of IDs is critical for detailed debugging and fixing the issues. Developers can use this detailed report to correct the specific inconsistencies in the data.

By logging and reporting both the high-level totals and the detailed lists, we ensure that the Beam job provides comprehensive information for both quick assessments and in-depth debugging.

> [!IMPORTANT]
> **Practice 6: Next, we need to generate reports.**
> We have the IDs of all `TopicModels`, the IDs of all `TopicSummaryModels`, and the IDs of the `TopicSummaryModels` that are missing. Let's write the code to generate the following reports:
> - The total number of `TopicModels`.
> - The total number of missing `TopicSummaryModels`.
> - A report of which `TopicModel` is missing a `TopicSummaryModel`.
> **Hint(s)**:
> Use the `job_result_transforms.CountObjectsToJobRunResult` transform to count the number of `TopicModels` and the number of missing `TopicSummaryModels`.
> Remember that this transform will convert the count into a `JobRunResult` with a message indicating the success of the count.
+> For reporting the missing `TopicSummaryModels`, use the `beam.Map` transform to create a `JobRunResult` with a message for each missing `TopicSummaryModel`. +> To understand the syntax of the above-mentioned transforms, take a look at how these transforms are used in other jobs at Oppia’s codebase. +> **References for code examples**: +> - [User Stats Computation Jobs](https://github.com/oppia/oppia/blob/f7d88f1dcf90a30eee1dbddca2a2eb5b46d087f1/core/jobs/batch_jobs/user_stats_computation_jobs.py#L49): Example - Look at how the `CountObjectsToJobRunResult` transform is applied to count objects and convert the result into a `JobRunResult`. +> - [User Validation Jobs](https://github.com/oppia/oppia/blob/f7d88f1dcf90a30eee1dbddca2a2eb5b46d087f1/core/jobs/batch_jobs/user_validation_jobs.py#L41): Example - Here, the `beam.Map` transform is used to generate `JobRunResult` messages. Review this to see how the transform processes data and produces detailed logging or reporting. + +With the IDs of both `TopicModels` and `TopicSummaryModels` and the identified missing `TopicSummaryModels`, we can now proceed to generate the reports. This will include the total number of topic models, the total number of missing topic summary models, and detailed information about each missing topic summary model. +Here's one approach to generating the reports: +
+Code for Generating Reports + +```python +number_of_topic_models_report = ( + topic_models_pcoll + | 'Report count of TopicModels' >> ( + job_result_transforms.CountObjectsToJobRunResult( + 'CountTotalTopicModels')) +) + +number_of_missing_topic_summary_models_report = ( + missing_topic_summary_models_pcoll + | 'Report count of missing TopicSummaryModels' >> ( + job_result_transforms.CountObjectsToJobRunResult( + 'CountMissingTopicSummaryModels')) +) + +missing_topic_summary_models_report = ( + missing_topic_summary_models_pcoll + | 'Report missing TopicSummaryModels' >> beam.Map( + lambda topic_id: job_run_result.JobRunResult.as_stderr( + 'TopicModel with id %s is missing a corresponding TopicSummaryModel' % topic_id) + ) +) +``` +
**Report count of TopicModels**: This reports the count of all `TopicModel` entities.
**Report count of missing TopicSummaryModels**: This reports the count of `TopicModel` entities that do not have corresponding `TopicSummaryModel` entities.
**Report missing TopicSummaryModels**: This logs detailed information about each missing summary model.

Finally, these three report streams need to be combined into a single output.
Here's one approach to do that + +```python +return ( + ( + number_of_topic_models_report, + number_of_missing_topic_summary_models_report, + missing_topic_summary_models_report, + ) + | 'Combine reported results' >> beam.Flatten() +) +``` +
+ +This combines all the reports into a single output to give a comprehensive result of our validation job. + +
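To see `beam.Flatten` in isolation, here is a tiny self-contained sketch (illustrative only, not Oppia code) that merges two small `PCollection`s into one, the same way the job merges its three report streams:

```python
import apache_beam as beam

with beam.Pipeline() as pipeline:
    counts = pipeline | 'Counts' >> beam.Create(
        ['CountTotalTopicModels SUCCESS: 2'])
    details = pipeline | 'Details' >> beam.Create(
        ['TopicModel with id topic_1 is missing a corresponding '
         'TopicSummaryModel'])

    # Flatten takes a tuple of PCollections and emits a single merged one.
    merged = (counts, details) | 'Merge reports' >> beam.Flatten()
    merged | 'Print merged' >> beam.Map(print)
```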
Here is the complete code with all the above steps:

```python
from core.jobs import base_jobs
from core.jobs.io import ndb_io
from core.jobs.transforms import job_result_transforms
from core.jobs.types import job_run_result
from core.platform import models

import apache_beam as beam

MYPY = False
if MYPY: # pragma: no cover
    from mypy_imports import topic_models

(topic_models,) = models.Registry.import_models([models.Names.TOPIC])


class ValidateTopicModelsJob(base_jobs.JobBase):
    """Job that validates TopicModel and TopicSummaryModel consistency."""

    def run(self) -> beam.PCollection[job_run_result.JobRunResult]:
        # Fetch all TopicModels.
        topic_models_pcoll = (
            self.pipeline
            | 'Get all TopicModels' >> (
                ndb_io.GetModels(topic_models.TopicModel.get_all()))
            | 'Extract topic model ids' >> beam.Map(
                lambda model: model.id)
        )

        # Fetch all TopicSummaryModels.
        topic_summary_models_pcoll = (
            self.pipeline
            | 'Get all TopicSummaryModels' >> (
                ndb_io.GetModels(topic_models.TopicSummaryModel.get_all()))
            | 'Extract topic summary model ids' >> beam.Map(
                lambda model: model.id)
        )

        # Identify missing TopicSummaryModels. The summary-model IDs are
        # passed as a side input, because a PCollection cannot be queried
        # with `in` directly.
        missing_topic_summary_models_pcoll = (
            topic_models_pcoll
            | 'Identify missing TopicSummaryModels' >> beam.Filter(
                lambda topic_id, summary_ids: topic_id not in summary_ids,
                summary_ids=beam.pvalue.AsList(topic_summary_models_pcoll))
        )

        # Generate reports.
        number_of_topic_models_report = (
            topic_models_pcoll
            | 'Report count of TopicModels' >> (
                job_result_transforms.CountObjectsToJobRunResult(
                    'CountTotalTopicModels'))
        )

        number_of_missing_topic_summary_models_report = (
            missing_topic_summary_models_pcoll
            | 'Report count of missing TopicSummaryModels' >> (
                job_result_transforms.CountObjectsToJobRunResult(
                    'CountMissingTopicSummaryModels'))
        )

        missing_topic_summary_models_report = (
            missing_topic_summary_models_pcoll
            | 'Report missing TopicSummaryModels' >> beam.Map(
                lambda topic_id: job_run_result.JobRunResult.as_stderr(
                    'TopicModel with id %s is missing a corresponding '
                    'TopicSummaryModel' % topic_id)
            )
        )

        # Combine reports.
        return (
            (
                number_of_topic_models_report,
                number_of_missing_topic_summary_models_report,
                missing_topic_summary_models_report,
            )
            | 'Combine reported results' >> beam.Flatten()
        )
```
To have your job registered and acknowledged by the front-end, make sure to import the module in the corresponding section of `core/jobs/registry.py`.

> Practice 7: Try importing the module in `core/jobs/registry.py`. Then check out the release coordinator page to verify that the job has been registered.
> Hint: Follow the steps mentioned in the wiki under [local development server](https://github.com/oppia/oppia/wiki/Apache-Beam-Jobs#local-development-server).

With this, our job is finally complete!

## Stage 5: Testing the Beam Job

Let's create unit tests for our `ValidateTopicModelsJob`. The objective of this stage is to ensure that the job works correctly under different scenarios. Effective testing involves considering various cases, such as:

- **Null Case**: Ensure the job produces no output when there are no models.
- **Standard Case**: Ensure the job correctly processes typical data.
- **Error Case**: Ensure the job handles unexpected or incorrect data gracefully.

When drafting test cases for your Beam job, consider the following:

- **Null Case:**
  - **Scenario**: There are no `TopicModels` and `TopicSummaryModels` in the datastore.
  - **Expected Outcome**: The job should complete without producing any output, indicating that there are no missing or inconsistent models to report. The output should be an empty list or a specific log message like "No TopicModels found."

- **Standard Case:**
  - **Scenario**: The datastore contains a typical set of `TopicModels` and corresponding `TopicSummaryModels` with correct relationships between them.
  - **Expected Outcome**: The job should successfully process the models and report that all models are consistent. For example, "All TopicModels have corresponding TopicSummaryModels," or a success result with counts showing no missing models.

- **Error Case - Missing Summary Models:**
  - **Scenario**: The datastore contains `TopicModels` without corresponding `TopicSummaryModels` or contains corrupted/incomplete data.
  - **Expected Outcome**: The job should correctly identify and report the IDs of the `TopicModels` that are missing `TopicSummaryModels`. Additionally, the job should handle any corrupted data gracefully by logging a warning or error message without crashing. The output could include messages like "TopicModel with ID X is missing a corresponding TopicSummaryModel."

> [!IMPORTANT]
> **Practice 8:** Try to draft the test cases for our Beam job. Think about which cases you would like to cover.
>
> **Hint:** Take a look at the test files of other jobs to understand how it’s done.

Here are the specific cases we will cover:

1. **Empty Storage:** Ensure the job produces no output when there are no models.
2. **Topic Model without Summary Model:** Ensure the job correctly identifies a topic model without a corresponding summary model.
3. **Topic Model with Summary Model:** Ensure the job does not report an issue when the topic model has a corresponding summary model.
4. **Multiple Topic Models with and without Summary Models:** Test a mix of topic models with and without corresponding summary models.

To start, we need to create a test class that inherits from `job_test_utils.JobTestBase` and defines our job class. We will also define some constants for the topic IDs we will use in our tests. A minimal skeleton is sketched below.
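Here is a minimal sketch of that skeleton, assuming the module name `topic_validation_jobs.py` that we chose earlier (the full setup, with mock models, follows below):

```python
from typing import Final, Type

from core.jobs import job_test_utils
from core.jobs.batch_jobs import topic_validation_jobs


class ValidateTopicModelsJobTests(job_test_utils.JobTestBase):
    """Unit tests for ValidateTopicModelsJob."""

    # JOB_CLASS tells JobTestBase which job each test in this class runs.
    JOB_CLASS: Type[
        topic_validation_jobs.ValidateTopicModelsJob
    ] = topic_validation_jobs.ValidateTopicModelsJob

    # A constant topic ID reused across the tests.
    TOPIC_ID: Final = 'topic_id'
```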
+ +> **General Advice:** When writing unit tests for your Beam job, a common approach involves creating mock data that mimics real-world scenarios. This allows you to simulate how the job would process data in a live environment. To make your tests effective, consider the following workflow: +> +> 1. **Set Up Mock Data:** Create mock versions of the models that your Beam job will process, such as `TopicModel` and `TopicSummaryModel`. These mock objects should reflect realistic data that might be encountered during production. +> 2. **Simulate User or Functionality Flows:** Structure your tests to reflect how the job processes data, including different scenarios like valid data, missing data, or inconsistent data between models. +> 3. **Use Setup Methods:** These methods prepare the datastore and other necessary components before running your tests. This involves creating sample models and ensuring they are correctly loaded into the testing environment. + +Below is an example of how you can apply this advice to create the necessary setup and test cases for our Beam job. This example sets up a test environment by creating a single `TopicModel` along with its associated `TopicRightsModel`. + +
+Example Test Setup + +```python +from __future__ import annotations + +from core import feconf +from core.jobs import job_test_utils +from core.jobs.batch_jobs import topic_validation_jobs +from core.jobs.types import job_run_result +from core.platform import models +from core.domain import topic_domain + +import datetime +from typing import Final, Type + +MYPY = False +if MYPY: + from mypy_imports import topic_models + +(topic_models,) = models.Registry.import_models([models.Names.TOPIC]) + +class ValidateTopicModelsJobTests(job_test_utils.JobTestBase): + + JOB_CLASS: Type[topic_validation_jobs.ValidateTopicModelsJob] = topic_validation_jobs.ValidateTopicModelsJob + + TOPIC_ID: Final = 'topic_id' + story_id_1: Final = 'story_id_1' + story_id_2: Final = 'story_id_2' + story_id_3: Final = 'story_id_3' + + def setUp(self) -> None: + # Set up common attributes for TopicModel and TopicRightsModel. + super().setUp() + self.CANONICAL_NAME = 'canonical_name' + self.LANGUAGE_CODE = 'en' + self.NAME = 'name' + self.NEXT_SUBTOPIC_ID = 1 + self.PAGE_TITLE_FRAGMENT_FOR_WEB = 'page_title_fragment_for_web' + self.STORY_REFERENCE_SCHEMA_VERSION = 1 + self.SUBTOPIC_SCHEMA_VERSION = 1 + self.URL_FRAGMENT = 'url_fragment' + + def create_topic_model(self, id, **kwargs): + """Creates a mock TopicModel with realistic data for testing.""" + # Default values for the mock TopicModel. + default_kwargs = { + 'canonical_name': self.CANONICAL_NAME, + 'language_code': self.LANGUAGE_CODE, + 'name': self.NAME, + 'next_subtopic_id': self.NEXT_SUBTOPIC_ID, + 'page_title_fragment_for_web': self.PAGE_TITLE_FRAGMENT_FOR_WEB, + 'story_reference_schema_version': self.STORY_REFERENCE_SCHEMA_VERSION, + 'subtopic_schema_version': self.SUBTOPIC_SCHEMA_VERSION, + 'url_fragment': self.URL_FRAGMENT, + } + default_kwargs.update(kwargs) + # Create the mock TopicModel. + topic_model = self.create_model(topic_models.TopicModel, id=id, **default_kwargs) + topic_model.update_timestamps() + + # Create and commit the associated TopicRightsModel. + first_topic_rights_model = self.create_model( + topic_models.TopicRightsModel, + id=id, + topic_is_published=False + ) + first_topic_rights_model.commit( + feconf.SYSTEM_COMMITTER_ID, + 'Create topic rights', + [{'cmd': topic_domain.CMD_CREATE_NEW}] + ) + + # Commit the TopicModel. + topic_model.commit( + feconf.SYSTEM_COMMITTER_ID, + 'Create topic rights', + [{'cmd': topic_domain.CMD_CREATE_NEW}] + ) +``` +
+ +Here are tests for the first two cases. Take a look at them and make sure that you understand how they work. + +
+Test Case: Empty Storage + +```python +def test_empty_storage(self) -> None: + # Ensure the job produces no output when there are no models + self.assert_job_output_is_empty() +``` +
+ +
+Test Case: Topic Model without Summary Model + +```python +def test_topic_model_without_summary_model(self) -> None: + # Create a topic model without a corresponding summary model + self.create_topic_model(self.TOPIC_ID) + + # Run the job and assert expected output + self.assert_job_output_is([ + job_run_result.JobRunResult( + stdout='', + stderr='Invalid TopicModel with id: %s' % self.TOPIC_ID + ), + job_run_result.JobRunResult( + stdout='CountInvalidTopicModels SUCCESS: 1', + stderr='' + ) + ]) +``` + +
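> **Note:** The exact `stdout`/`stderr` strings in these assertions must match whatever your job actually emits. If your job uses the messages from Stage 4 (for example, `CountMissingTopicSummaryModels` and `TopicModel with id ... is missing a corresponding TopicSummaryModel`), adjust either the job's messages or the expected strings here so that the two sides agree.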
+ +> [!IMPORTANT] +> Practice 9: Based on the above test cases, try to write the remaining cases on your own: +> +> Topic Model with Summary Model: Ensure that the job does not report an issue when the topic model has a corresponding summary model. +> Multiple Topic Models with and without Summary Models: Test a mix of topic models that have and do not have corresponding summary models. + +
+Test Case: Topic Model with Summary Model + +```python +def test_topic_model_with_summary_model(self) -> None: + # Create a TopicSummaryModel for the existing TopicModel + topic_summary_model = self.create_model( + topic_models.TopicSummaryModel, + id=self.TOPIC_ID, + canonical_name=self.CANONICAL_NAME, + name=self.NAME, + language_code=self.LANGUAGE_CODE, + url_fragment=self.URL_FRAGMENT, + description='dummy description', + canonical_story_count=0, + additional_story_count=0, + total_skill_count=0, + total_published_node_count=0, + uncategorized_skill_count=0, + subtopic_count=0, + version=1, + published_story_exploration_mapping={ + self.story_id_1: [], + self.story_id_2: [], + self.story_id_3: [] + }, + topic_model_last_updated=datetime.datetime.utcnow(), + topic_model_created_on=datetime.datetime.utcnow() + ) + + # Update timestamps and store the model + topic_summary_model.update_timestamps() + self.put_multi([topic_summary_model]) + self.create_topic_model(self.TOPIC_ID) + + # Run the job and assert expected output + self.assert_job_output_is([ + job_run_result.JobRunResult( + stdout='CountValidTopicModels SUCCESS: 1' + ) + ]) +``` + +
+ +
+Test Case: Multiple Topic Models with and without Summary Models + +```python +def test_multiple_topic_models_with_and_without_summary_models(self) -> None: + """Test a mix of topic models with and without corresponding summary models.""" + # Create a topic model without a summary model + self.create_topic_model(self.TOPIC_ID + '_1') + + # Create a topic model with a summary model + self.create_topic_model(self.TOPIC_ID + '_2') + topic_summary_model_2 = self.create_model( + topic_models.TopicSummaryModel, + id=self.TOPIC_ID + '_2', + canonical_name=self.CANONICAL_NAME, + name=self.NAME, + language_code=self.LANGUAGE_CODE, + url_fragment=self.URL_FRAGMENT, + description='dummy description', + canonical_story_count=0, + additional_story_count=0, + total_skill_count=0, + total_published_node_count=0, + uncategorized_skill_count=0, + subtopic_count=0, + version=1, + published_story_exploration_mapping={ + self.story_id_1: [], + self.story_id_2: [], + self.story_id_3: [] + }, + topic_model_last_updated=datetime.datetime.utcnow(), + topic_model_created_on=datetime.datetime.utcnow() + ) + topic_summary_model_2.update_timestamps() + + # Create another topic model with a summary model + self.create_topic_model(self.TOPIC_ID + '_3') + topic_summary_model_3 = self.create_model( + topic_models.TopicSummaryModel, + id=self.TOPIC_ID + '_3', + canonical_name=self.CANONICAL_NAME, + name=self.NAME, + language_code=self.LANGUAGE_CODE, + url_fragment=self.URL_FRAGMENT, + description='dummy description', + canonical_story_count=0, + additional_story_count=0, + total_skill_count=0, + total_published_node_count=0, + uncategorized_skill_count=0, + subtopic_count=0, + version=1, + published_story_exploration_mapping={ + self.story_id_1: [], + self.story_id_2: [], + self.story_id_3: [] + }, + topic_model_last_updated=datetime.datetime.utcnow(), + topic_model_created_on=datetime.datetime.utcnow() + ) + topic_summary_model_3.update_timestamps() + self.put_multi([topic_summary_model_2, topic_summary_model_3]) + + # Run the job and assert expected output + self.assert_job_output_is([ + job_run_result.JobRunResult( + stderr='Invalid TopicModel with id: %s' % (self.TOPIC_ID + '_1') + ), + job_run_result.JobRunResult( + stdout='CountInvalidTopicModels SUCCESS: 1' + ), + job_run_result.JobRunResult( + stdout='CountValidTopicModels SUCCESS: 2' + ) + ]) +``` + +
+ +
+Here is the complete code for the unit tests: + +```python +"""Unit tests for jobs.batch_jobs.topic_validation_jobs.""" + +from __future__ import annotations + +from core import feconf +from core.jobs import job_test_utils +from core.jobs.batch_jobs import topic_validation_jobs +from core.jobs.types import job_run_result +from core.platform import models +from core.domain import topic_domain + +import datetime + +from typing import Final, Type + +MYPY = False +if MYPY: + from mypy_imports import topic_models + +(topic_models,) = models.Registry.import_models([models.Names.TOPIC]) + + +class ValidateTopicModelsJobTests(job_test_utils.JobTestBase): + + JOB_CLASS: Type[topic_validation_jobs.ValidateTopicModelsJob] = topic_validation_jobs.ValidateTopicModelsJob + + TOPIC_ID: Final = 'topic_id' + story_id_1: Final = 'story_id_1' + story_id_2: Final = 'story_id_2' + story_id_3: Final = 'story_id_3' + + def setUp(self) -> None: + super().setUp() + self.CANONICAL_NAME = 'canonical_name' + self.LANGUAGE_CODE = 'en' + self.NAME = 'name' + self.NEXT_SUBTOPIC_ID = 1 + self.PAGE_TITLE_FRAGMENT_FOR_WEB = 'page_title_fragment_for_web' + self.STORY_REFERENCE_SCHEMA_VERSION = 1 + self.SUBTOPIC_SCHEMA_VERSION = 1 + self.URL_FRAGMENT = 'url_fragment' + + def create_topic_model(self, id, **kwargs): + default_kwargs = { + 'canonical_name': self.CANONICAL_NAME, + 'language_code': self.LANGUAGE_CODE, + 'name': self.NAME, + 'next_subtopic_id': self.NEXT_SUBTOPIC_ID, + 'page_title_fragment_for_web': self.PAGE_TITLE_FRAGMENT_FOR_WEB, + 'story_reference_schema_version': self.STORY_REFERENCE_SCHEMA_VERSION, + 'subtopic_schema_version': self.SUBTOPIC_SCHEMA_VERSION, + 'url_fragment': self.URL_FRAGMENT, + } + default_kwargs.update(kwargs) + topic_model = self.create_model(topic_models.TopicModel, id=id, **default_kwargs) + topic_model.update_timestamps() + + first_topic_rights_model = self.create_model( + topic_models.TopicRightsModel, + id=id, + topic_is_published=False + ) + first_topic_rights_model.commit( + feconf.SYSTEM_COMMITTER_ID, + 'Create topic rights', + [{'cmd': topic_domain.CMD_CREATE_NEW}] + ) + + topic_model.commit( + feconf.SYSTEM_COMMITTER_ID, + 'Create topic rights', + [{'cmd': topic_domain.CMD_CREATE_NEW}] + ) + + def test_empty_storage(self) -> None: + self.assert_job_output_is_empty() + + def test_topic_model_without_summary_model(self) -> None: + # Create a topic model without a corresponding summary model + self.create_topic_model(self.TOPIC_ID) + + # Run the job and assert expected output + + self.assert_job_output_is([ + job_run_result.JobRunResult( + stdout='', + stderr='Invalid TopicModel with id: %s' % self.TOPIC_ID + ), + job_run_result.JobRunResult( + stdout='CountInvalidTopicModels SUCCESS: 1', + stderr='' + ) + ]) + + def test_topic_model_with_summary_model(self) -> None: + topic_summary_model = self.create_model( + topic_models.TopicSummaryModel, + id=self.TOPIC_ID, + canonical_name=self.CANONICAL_NAME, + name=self.NAME, + language_code=self.LANGUAGE_CODE, + url_fragment=self.URL_FRAGMENT, + description='dummy description', + canonical_story_count=0, + additional_story_count=0, + total_skill_count=0, + total_published_node_count=0, + uncategorized_skill_count=0, + subtopic_count=0, + version=1, + published_story_exploration_mapping={ + self.story_id_1: [], + self.story_id_2: [], + self.story_id_3: [] + }, + topic_model_last_updated=datetime.datetime.utcnow(), + topic_model_created_on=datetime.datetime.utcnow() + ) + + topic_summary_model.update_timestamps() + 
self.put_multi([topic_summary_model]) + self.create_topic_model(self.TOPIC_ID) + + self.assert_job_output_is([ + job_run_result.JobRunResult( + stdout='CountValidTopicModels SUCCESS: 1' + ) + ]) + + def test_multiple_topic_models_with_and_without_summary_models(self) -> None: + self.create_topic_model(self.TOPIC_ID + '_1') + self.create_topic_model(self.TOPIC_ID + '_2') + topic_summary_model_2 = self.create_model( + topic_models.TopicSummaryModel, + id=self.TOPIC_ID + '_2', + canonical_name=self.CANONICAL_NAME, + name=self.NAME, + language_code=self.LANGUAGE_CODE, + url_fragment=self.URL_FRAGMENT, + description='dummy description', + canonical_story_count=0, + additional_story_count=0, + total_skill_count=0, + total_published_node_count=0, + uncategorized_skill_count=0, + subtopic_count=0, + version=1, + published_story_exploration_mapping={ + self.story_id_1: [], + self.story_id_2: [], + self.story_id_3: [] + }, + topic_model_last_updated=datetime.datetime.utcnow(), + topic_model_created_on=datetime.datetime.utcnow() + ) + topic_summary_model_2.update_timestamps() + self.create_topic_model(self.TOPIC_ID + '_3') + topic_summary_model_3 = self.create_model( + topic_models.TopicSummaryModel, + id=self.TOPIC_ID + '_3', + canonical_name=self.CANONICAL_NAME, + name=self.NAME, + language_code=self.LANGUAGE_CODE, + url_fragment=self.URL_FRAGMENT, + description='dummy description', + canonical_story_count=0, + additional_story_count=0, + total_skill_count=0, + total_published_node_count=0, + uncategorized_skill_count=0, + subtopic_count=0, + version=1, + published_story_exploration_mapping={ + self.story_id_1: [], + self.story_id_2: [], + self.story_id_3: [] + }, + topic_model_last_updated=datetime.datetime.utcnow(), + topic_model_created_on=datetime.datetime.utcnow() + ) + topic_summary_model_3.update_timestamps() + self.put_multi([topic_summary_model_2, topic_summary_model_3]) + + self.assert_job_output_is([ + job_run_result.JobRunResult( + stderr='Invalid TopicModel with id: %s' % (self.TOPIC_ID + '_1') + ), + job_run_result.JobRunResult( + stdout='CountInvalidTopicModels SUCCESS: 1' + ), + job_run_result.JobRunResult( + stdout='CountValidTopicModels SUCCESS: 2' + ) + ]) +``` + +
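Once the test file is in place, you can run it with Oppia's backend test runner, for example: `python -m scripts.run_backend_tests --test_target=core.jobs.batch_jobs.topic_validation_jobs_test` (the target assumes the module name we chose earlier).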
## Stage 6: Run and Validate the Job

Now let’s try running the job on our local server.

1. Sign in as an administrator ([instructions](https://github.com/oppia/oppia/wiki/How-to-access-Oppia-webpages#log-in-as-a-super-administrator)).
2. Navigate to Admin Page > Roles Tab.
3. Add the "Release Coordinator" role to the username you are signed in with.
4. Navigate to [http://localhost:8181/release-coordinator](http://localhost:8181/release-coordinator), then to the Beam Jobs tab.
5. Search for your job and then click the Play button.
6. Click "Start a new job."

In this first run, we didn’t create any dummy data, so it covers a scenario similar to our first unit test. Let’s create dummy data and see how our job behaves. Generate dummy data by following the steps mentioned in this wiki page: [Populating data on local server](https://github.com/oppia/oppia/wiki/Populating-data-on-local-server#maths-classroom-page).

> **Practice 10:** Now let's create dummy data to see how our job works with different scenarios.
>
> Follow the steps mentioned in this [wiki page](https://github.com/oppia/oppia/wiki/Populating-data-on-local-server#maths-classroom-page) to generate dummy data on your local server.
> Re-run the job following the same steps as before.

Even with the dummy data, the Beam job doesn’t report any topic ID with a missing topic summary model. That is because, by default, every topic model is accompanied by its corresponding `TopicSummaryModel`.

> **Practice 11:** To test the job's ability to identify a missing `TopicSummaryModel`:
>
> - Set up a local datastore using DSAdmin by following [these instructions](https://github.com/oppia/oppia/wiki/Debugging-datastore-locally).
> - Manually delete one of the `TopicSummaryModel` entries.
> - Re-run the job to check if it correctly identifies the missing `TopicSummaryModel`.

This time, the Beam job will report the topic ID whose `TopicSummaryModel` you deleted.

> **Practice 12:** Now that you have a good understanding of Beam jobs at Oppia, let's add one more case to cover. Your job should also report any `TopicSummaryModel` that lacks a corresponding `TopicModel`.
>
> **Hint:** You'll need to adjust the job logic to include a check for `TopicSummaryModel` entries without corresponding `TopicModel` entries. Ensure that the unit tests also cover this scenario. Test the job by manually creating or deleting entries in your local datastore to mimic this scenario.

## Conclusion

Congratulations! You've written a Beam job that validates the consistency between `TopicModel` and `TopicSummaryModel`. You have also learned how to test your job with various scenarios to ensure its correctness.

For further reading and more complex scenarios, refer to the [Apache Beam documentation](https://beam.apache.org/documentation/) and [Oppia's developer guides](https://github.com/oppia/oppia/wiki).

### Additional Steps for Production Deployment

> **Note:** If you want to run and deploy this job in production, there are additional steps to follow. These steps ensure that your job runs smoothly in a live environment and produces the expected results.

Before a job can be run and deployed in production, it must first be tested on the Oppia backup server. Here are the requirements for a job to be considered "fully tested":

- **No Failures:** The job should run without failures on the Oppia backup server.
- **Expected Output:** The job produces the expected output.
- **Expected Outcome:** The job has the expected outcome, which must be verified by user-facing changes, a validation job, or an output check.
- **Approval:** The job should be explicitly approved by the server jobs admin.

To understand the full process of testing Beam jobs at Oppia, please refer to the [Oppia Wiki on Testing Jobs](https://github.com/oppia/oppia/wiki/Testing-jobs-and-other-features-on-production). This wiki page includes detailed instructions and a template for submitting your job for production deployment. The template provides a structured way to request testing and approval, and the linked doc shows how to fill it out specifically for this job.

By following these steps, you'll ensure that your Beam job is ready for production and can be deployed to help maintain the integrity and consistency of data in Oppia.

## We Value Your Feedback

Did you find this tutorial useful? Or, did you encounter any issues or find things hard to grasp? Let us know by opening a discussion on [GitHub Discussions](https://github.com/oppia/oppia/discussions/new?category=tutorial-feedback). We would be happy to help you and make improvements as needed!

From 6d864d156f590495161563241fd9cc500049afce Mon Sep 17 00:00:00 2001
From: Ashwath Kannan
Date: Fri, 4 Oct 2024 01:20:55 +0530
Subject: [PATCH 2/9] Stage 1

---
 ...the-reproduction-steps-for-server-error.md | 968 ++----------------
 1 file changed, 73 insertions(+), 895 deletions(-)

diff --git a/Tutorial:-Learn-How-to-figure-out-the-reproduction-steps-for-server-error.md b/Tutorial:-Learn-How-to-figure-out-the-reproduction-steps-for-server-error.md
index 1e32cd42..ff58434d 100644
--- a/Tutorial:-Learn-How-to-figure-out-the-reproduction-steps-for-server-error.md
+++ b/Tutorial:-Learn-How-to-figure-out-the-reproduction-steps-for-server-error.md
@@ -1,23 +1,23 @@
-## Introduction
+# Tutorial: Learn How to Figure Out the Reproduction Steps for a Server Error

## Introduction

This tutorial will guide you through debugging a server error that is challenging to reproduce locally. Specifically, we will investigate and fix a `TypeError` related to certificate generation for contributors.


### Skills Covered:

- Codebase Navigation
- Identifying and Analyzing Error Logs
- Debugging Techniques
- Reproducing Server Errors Locally

### Scenario:

One of the server admins has reported the following error logs. Your task is to investigate the issue and determine how and why it is occurring.

> **Note:** The primary goal of this tutorial is not to find a solution, but to guide you through the process of investigating and understanding the workflow of debugging server errors. In this tutorial, you will follow the steps a developer might take to investigate this server error.

```plaintext
TypeError: expected string or bytes-like object

Exception raised: expected string or bytes-like object
Traceback (most recent call last):
  File "/layers/google.python.pip/pip/lib/python3.8/site-packages/webapp2.py", line 604, in dispatch
    return method(*args, **kwargs)
  File "/workspace/core/controllers/acl_decorators.py", line 4788, in test_can_fetch_all_contributor_dashboard_stats
    return handler(self, username, **kwargs)
  File "/workspace/core/controllers/contributor_dashboard.py", line 1062, in get
    response = suggestion_services.generate_contributor_certificate_data(
  File "/workspace/core/domain/suggestion_services.py", line 3870, in generate_contributor_certificate_data
    data = _generate_translation_contributor_certificate_data(
  File "/workspace/core/domain/suggestion_services.py", line 3945, in _generate_translation_contributor_certificate_data
    plain_text = _get_plain_text_from_html_content_string(
  File "/workspace/core/domain/suggestion_services.py", line 1408, in _get_plain_text_from_html_content_string
    html_content_string_with_rte_tags_replaced = re.sub(
  File "/layers/google.python.runtime/python/lib/python3.8/re.py", line 210, in sub
    return _compile(pattern, flags).sub(repl, string, count)
TypeError: expected string or bytes-like object
```

## Prerequisites

Before you begin, make sure you have:

- Set up your development environment. (If you haven't, follow the [Oppia setup instructions](https://github.com/oppia/oppia/wiki/Installing-Oppia).)
-- Familiarize yourself with [Debugging at Oppia wiki page.](https://github.com/oppia/oppia/wiki/Debugging). -- Before starting this tutorial, it is important to have a basic understanding of the following key concepts: - - **Exploration**: Familiarize yourself with explorations by reviewing the [Oppia User Documentation on Explorations](https://oppia.github.io/#/KeyConceptsInOppia). - - **Topic**: Learn about topics by visiting the [relevant wiki page](https://github.com/oppia/oppia/wiki/How-to-access-Oppia-webpages#overview-of-entities). - - These concepts are fundamental to understanding how content is structured in Oppia, and they will help you follow the steps in this tutorial more easily. - -## Procedure - -> The following steps illustrate how a developer might tackle this issue. Try following this tutorial step-by-step on your local machine! This will give you a better sense of how to tackle other similar issues in the codebase. If you get stuck with a step in this tutorial, raise an issue in (GitHub Discussions)[https://github.com/oppia/oppia/discussions/categories/tutorial-feedback] to get help. - -> **Important:** When you see a “practice question box”, stop and try to figure out the answer on your own before reading ahead. You will learn more if you try to figure out your own answer to the question first! - - - -## Stage 1: Understand the Architecture of Beam Jobs at Oppia - -Beam jobs at Oppia are typically structured in a modular fashion, where the main components include: - -- **Pipeline Definition**: This is where you define the structure of your Beam pipeline. -- **Transforms**: These are the processing steps that operate on the data. -- **IO**: Input and output operations to read from and write to data sources. -- **Testing**: Unit tests to ensure the job is correct. - -Refer to the [Beam Jobs at Oppia documentation](https://github.com/oppia/oppia/wiki/Apache-Beam-Jobs#writing-apache-beam-jobs) for a detailed overview. - -## Stage 2: Understand the Models - -> [!IMPORTANT] -> **Practice 1: Locate the definition of `TopicModel` and `TopicSummaryModel`.** -> You can utilize the quick search functionality of your code editor to facilitate this process. -> **Hint**: Take a look at this wiki page on [Tips for common IDEs](https://github.com/oppia/oppia/wiki/Tips-for-common-IDEs#visual-studio-code). - -Before diving into the implementation, it's crucial to understand the models we will be working with. - -
-Topic Model (oppia/core/storage/topic/gae_models.py) - -```python -class TopicModel(base_models.VersionedModel): - """Model for storing Topics. - This class should only be imported by the topic services file and the topic model test file. - """ -``` -
- -The TopicModel is used for storing detailed information about topics. This model inherits from Versioned Model and thus is versioned, meaning it maintains a history of changes made to each instance. If you are curious about how versions are maintained and want to take a proper look, take a look at the VersionedModel definition in file oppia/core/storage/base_model/gae_models.py. - -
-Topic Summary Model (oppia/core/storage/topic/gae_models.py) - -```python -class TopicSummaryModel(base_models.BaseModel): - """Summary model for an Oppia Topic. - This should be used whenever the content blob of the topic is not needed (e.g. search results, etc). - The key of each instance is the topic id. - """ - -``` -
- -The TopicSummaryModel provides a summarized view of a topic, which is useful for scenarios where the full details of a topic’s contents are not required, such as search results. Each TopicSummaryModel instance is identified by the topic id. - -## Stage 3: Draft the Directed Acyclic Graph (DAG) - -Before we write any code, let's visualize the workflow of our Beam job as a Directed Acyclic Graph (DAG). This will help us understand the sequence of operations and the flow of data through the pipeline. - -#### What is a DAG? -Like all graphs, a directed acyclic graph (DAG) consists of nodes connected by edges. In this case, the nodes are steps in the job, and the edges indicate the order in which to complete the steps. The edges are thus directional (hence "directed"), and the graph isn't allowed to have any cycles (hence "acyclic"). In other words, it should be impossible to start at one node and follow the edges back to the same node, as this would create an infinite loop in our job. - -For more detailed information about DAGs, you can refer to the [DAG Wikipedia](https://en.wikipedia.org/wiki/Directed_acyclic_graph) page. - -Visualizing our Beam job as a DAG helps in planning the structure and flow of our data processing pipeline. It provides a clear picture of how data moves from one step to another, ensuring that all necessary operations are performed in the correct order. - -### Step 3.1: Define the Job's Objective +## Procedure: -> [!IMPORTANT] -> Practice 2: Take a notebook and try drafting a rough workflow of what our job would do, using boxes for the steps and arrows to connect different steps. -> Hint: -> - Read Everything First: Start by reading all the necessary data at the beginning of the job. This ensures that you have all the required information before performing any operations. -> - Process Data in Steps: Break down the job's functionality into simpler steps, such as filtering, transforming, and aggregating the data. Each step should be a separate node in your DAG. -> - Write Everything Last: Ensure that all writing operations, such as saving results or updating models, are performed at the end of the job. This helps in maintaining data consistency and avoids incomplete writes. +The following steps illustrate how a developer might tackle this issue. Try following this tutorial step-by-step on your local machine! This will give you a better sense of how to tackle other similar issues in the codebase. If you get stuck with a step in this tutorial, raise an issue in GitHub Discussions to get help. -The job's objective is to validate that each topic model in the datastore has a corresponding topic summary model. The workflow can be broken down into the following steps: +**Important**: When you see a “practice question box”, stop and try to figure out the answer on your own before reading ahead. You will learn more if you try to figure out your own answer to the question first! -- **Read Topic Models**: Read all the topic models from the datastore. -- **Validate Topic Models**: For each topic model, check if there is a corresponding topic summary model. -- **Output Results**: Write the validation results to an output sink (e.g., a text file). +#### Setup: -### Step 3.2: Draw the Directed Acyclic Graph (DAG) +**Install Oppia on Your Local Machine** +To begin, you'll need to have Oppia installed on your local machine. If you haven't done so already, please follow the installation steps provided in this [wiki page](https://github.com/oppia/oppia/wiki). 
-Here's a simple representation of the DAG for our Beam job: +**Check Out the Specific Commit** +To ensure that you are working with the same version of the code as this tutorial, navigate to your local Oppia directory and check out the specific commit: -![DAG for Beam Job](images/TutorialDebuggingServerError/DAG.jpg) - -In this DAG: - -- **ReadTopicModels**: Reads topic models from the datastore and extracts their IDs. -- **ReadTopicSummaries**: Reads topic summary models from the datastore and extracts their IDs. -- **ValidateTopicModels**: Processes each topic model ID to check for a corresponding topic summary model ID, identifying any missing topic summary models. -- **WriteResults**: Writes the validation results to an output sink, including: - - The count of total topic models. - - The count of missing topic summary models. - - Necessary details about each missing topic summary model. - -#### Data Flow: - -- The **ReadTopicModels** step outputs the IDs of the topic models. -- The **ReadTopicSummaryModels** step outputs the IDs of the topic summary models. -- The **ValidateTopicModels** step identifies missing topic summary models. -- The **WriteResults** step generates reports based on the validation. - -#### Batch I/O vs Incremental I/O - -An alternative workflow you might think of could involve iterating over each model and, while iterating over a specific model, validating it and adding the results to a log. -Before diving into different approaches, it's important to understand whether incremental reads and writes are feasible within Apache Beam. - -#### Beam's Batch Processing Model: - -Apache Beam generally operates on a batch processing model, where data is read all at once, processed, and then written out. The idea of interleaving reads and writes within each loop iteration—reading, validating, and then writing one model at a time—isn't supported in this model. While Beam does offer windowing for processing data in chunks, this approach adds complexity and isn't currently used in Oppia's workflows. - -Since incremental I/O isn’t feasible in Beam’s batch model, the batch approach is both necessary and more effective. Here’s why: - -- **Efficiency**: Batch I/O allows for optimized, parallel processing, making it faster than processing one item at a time, which would otherwise slow down due to constant I/O operations. -- **Data Consistency**: Writing all results at the end ensures consistency, avoiding issues like incomplete logs or partial data that could occur with incremental writes, even if they were possible. -- **Simplicity**: Separating reading, processing, and writing into distinct stages makes the workflow clearer and easier to debug. Mixing these steps would only complicate the process, making troubleshooting more difficult. - -## Stage 4: Writing the Beam Job - -With the DAG in mind, we can now proceed to implement the Beam job. Let’s create a new file for our job in the `oppia/core/jobs/batch_jobs` directory. - -> [!IMPORTANT] -> **Practice 3: Decide on suitable names for the module and job.** -> Hint: Follow the conventions mentioned in the [Apache Beam Jobs Wiki](https://github.com/oppia/oppia/wiki/Apache-Beam-Jobs#testing-apache-beam-jobs). - -Per the Oppia documentation for Beam Jobs: - -- The name of the file follows the format `__jobs.py`. In this case, we can use something like `topic_validation_jobs.py`. -- The name of the job follows the convention: `Job`. In this case, we can name the job as `ValidateTopicModelsJob`. 
- -Let’s first understand the part where we need to fetch the `TopicModels`: - -
-Code for Fetching Topic Models - -```python -topic_models_pcoll = ( - self.pipeline - | 'Get all TopicModels' >> ( - ndb_io.GetModels(topic_models.TopicModel.get_all())) - | 'Extract topic model ids' >> beam.Map( - lambda model: model.id) -) +```bash +git checkout 192f0a9a4866debac160015bc949130aaae6a7fe ``` -
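
A note for those following along: checking out a commit hash like this puts your repository in a "detached HEAD" state, which is fine for this tutorial. If you'd rather keep your experiments on a branch, one option (hypothetical branch name) is:

```bash
git checkout -b debug-server-error 192f0a9a4866debac160015bc949130aaae6a7fe
```
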
- -In the above code, self.pipeline is the starting point of our Beam pipeline. ndb_io.GetModels(topic_models.TopicModel.get_all()) fetches all TopicModel entities from the datastore. beam.Map(lambda model: model.id) extracts the id from each TopicModel entity. We only need the IDs to check for corresponding TopicSummaryModel entities. - -> [!IMPORTANT] -> Practice 4: Based on the previous step where we fetched all the topic models, can you try writing the code for fetching topic summary models? -> Hint: Refer to the code snippet for fetching TopicModel entities and apply a similar approach. -
Code for Fetching Topic Summary Models +**Verify the Commit:** +You can verify that you are on the correct commit by running: -```python -topic_summary_models_pcoll = ( - self.pipeline - | 'Get all TopicSummaryModels' >> ( - ndb_io.GetModels(topic_models.TopicSummaryModel.get_all())) - | 'Extract topic summary model ids' >> beam.Map( - lambda model: model.id) -) +```bash +git log -1 ``` -
- -Similar to the previous step, this code fetches all TopicSummaryModel entities and extracts their IDs. This step ensures we have a list of all summary models to compare against our topic models. -> [!IMPORTANT] -> Practice 5: Now that we have the IDs of both TopicModels and TopicSummaryModels, we need to compare them. If a TopicModel ID > is not present in the TopicSummaryModel IDs, we need to store it so that we can report it later in the next stage. -> Hint: Use the beam.Filter transform to filter out the IDs that are not present in the list of TopicSummaryModel IDs. -> Note that the beam.Filter transform works with a lambda function that returns True for items you want to keep and False for those you want to exclude. -> You might need to use the beam.Map transform to extract the IDs before using beam.Filter. -> For more details, you can also refer to the [Apache Beam Jobs Wiki on PTransforms](https://github.com/oppia/oppia/wiki/Apache-Beam-Jobs#ptransforms). - -
Code for Identifying Missing Topic Summary Models - -```python -missing_topic_summary_models_pcoll = ( - topic_models_pcoll - | 'Identify missing TopicSummaryModels' >> beam.Filter( - lambda topic_id: topic_id not in topic_summary_models_pcoll) -) -``` -
+The output should display the commit ID `192f0a9a4866debac160015bc949130aaae6a7fe`. -beam.Filter(lambda topic_id: topic_id not in topic_summary_models_pcoll) filter checks if each TopicModel ID has a corresponding TopicSummaryModel ID. -This step helps us identify all topic models that lack corresponding summary models, which is the main goal of our validation job. +#### Debugging Process: +When faced with a server error, developers at Oppia typically follow these steps: -### Planning Reports and Logs +**Analyze the Error Logs to Locate the Affected Code:** Start by reviewing the error logs to find a stack trace that indicates where the error occurred. Pinpoint the relevant file, function, or line number where the problem originated. -Logging and reporting are crucial for understanding the outcome of the Beam job and for debugging. In our validation job, we need to consider what specific information is essential to log and report, along with the purpose of each. +**Examine the Affected Code:** Open the identified file(s) and examine the specific code blocks related to the error. Understand the intended functionality of the code and check for any immediate errors or inconsistencies. -#### What to Log and Report: +**Investigate Potential Causes by Exploring the Code in Depth:** Consider possible reasons for the error, such as incorrect data types, missing conditions, edge cases, or unexpected inputs. Dive deeper into the code, focusing on sections that are most likely causing the issue. Look for logic errors, unhandled cases, or data processing problems that align with your initial suspicions. -- **Total Number of Topic Models**: - **Purpose**: This value helps verify that the Beam job is processing the expected number of topic models. If the number is lower or higher than anticipated, it can indicate an issue with reading the data from the datastore, such as missing or duplicated topic models. - **Details Logged**: The total count of topic models successfully read from the datastore. +**Reproduce the Error:** Set up an environment to recreate the conditions that led to the error. Use test data or modify unit tests to replicate the issue and confirm your hypotheses about its cause. -- **Total Number of Missing Topic Summary Models**: - **Purpose**: This provides a quick overview of how many topic models are missing corresponding summary models, indicating potential data inconsistencies. Although this information can be derived from point 3, reporting it separately serves as a quick sanity check for admins or developers. If the total number is non-zero, it signals an issue that needs further investigation. - **Details Logged**: The total count of topic models that do not have corresponding topic summary models. +**Document Your Findings:** Once you've identified and confirmed the cause of the error, document your findings in detail on the original GitHub issue. Provide a summary of the error, clear steps to reproduce it, and any relevant observations or logs to support your conclusions. -- **Report Topic IDs with Missing Topic Summary Models**: - **Purpose**: While the total number (logged in point 2) provides a high-level summary, this log provides the specific IDs of the topic models that are missing their summary models. This aids in identifying which exact models need attention and allows for detailed investigation and debugging. - **Details Logged**: A list of the topic model IDs that do not have corresponding topic summary models. 
+### Stage 1: Analyze the Error Logs to Locate the Affected Code +**Objective**: Identify where the error occurred in the code. + +> **Note**: This tutorial focuses on server errors, which are typically reported by server admins and come with error logs. For other issues, such as user-reported bugs, error logs might not always be available. In these cases, it is essential to ask users for "steps to reproduce the bug." If this information is not provided, we can contact the user to request additional details that will help us understand the issue better. -#### Cross-Consistency Checking - -Reporting both the total number of missing summary models and the specific IDs serves a dual purpose: - -- **Summary for Monitoring**: The total count is useful as a quick sanity check. If it’s non-zero, it signals that there's an issue needing attention. -- **Detailed Debugging**: The list of IDs is critical for detailed debugging and fixing the issues. Developers can use this detailed report to correct the specific inconsistencies in the data. - -By logging and reporting both the high-level totals and the detailed lists, we ensure that the Beam job provides comprehensive information for both quick assessments and in-depth debugging. - -> [!IMPORTANT] -> **Practice 6: Next, we need to generate reports.** -> We have the IDs of all `TopicModels`, the IDs of all `TopicSummaryModels`, and the IDs of the `TopicSummaryModels` that are missing. Let's write the code to generate the following reports: -> - The total number of `TopicModels`. -> - The total number of missing `TopicSummaryModels`. -> - Inform which `TopicModel` is missing a `TopicSummaryModel`. -> **Hint(s)**: -> Use the `job_result_transforms.CountObjectsToJobRunResult` transform to count the number of `TopicModels` and the number of missing `TopicSummaryModels`. -> Remember that this transform will convert the count into a `JobRunResult` with a message indicating the success of the count. -> For reporting the missing `TopicSummaryModels`, use the `beam.Map` transform to create a `JobRunResult` with a message for each missing `TopicSummaryModel`. -> To understand the syntax of the above-mentioned transforms, take a look at how these transforms are used in other jobs at Oppia’s codebase. -> **References for code examples**: -> - [User Stats Computation Jobs](https://github.com/oppia/oppia/blob/f7d88f1dcf90a30eee1dbddca2a2eb5b46d087f1/core/jobs/batch_jobs/user_stats_computation_jobs.py#L49): Example - Look at how the `CountObjectsToJobRunResult` transform is applied to count objects and convert the result into a `JobRunResult`. -> - [User Validation Jobs](https://github.com/oppia/oppia/blob/f7d88f1dcf90a30eee1dbddca2a2eb5b46d087f1/core/jobs/batch_jobs/user_validation_jobs.py#L41): Example - Here, the `beam.Map` transform is used to generate `JobRunResult` messages. Review this to see how the transform processes data and produces detailed logging or reporting. - -With the IDs of both `TopicModels` and `TopicSummaryModels` and the identified missing `TopicSummaryModels`, we can now proceed to generate the reports. This will include the total number of topic models, the total number of missing topic summary models, and detailed information about each missing topic summary model. -Here's one approach to generating the reports: -
-Code for Generating Reports +Here is the error log we need to investigate: ```python -number_of_topic_models_report = ( - topic_models_pcoll - | 'Report count of TopicModels' >> ( - job_result_transforms.CountObjectsToJobRunResult( - 'CountTotalTopicModels')) -) - -number_of_missing_topic_summary_models_report = ( - missing_topic_summary_models_pcoll - | 'Report count of missing TopicSummaryModels' >> ( - job_result_transforms.CountObjectsToJobRunResult( - 'CountMissingTopicSummaryModels')) -) - -missing_topic_summary_models_report = ( - missing_topic_summary_models_pcoll - | 'Report missing TopicSummaryModels' >> beam.Map( - lambda topic_id: job_run_result.JobRunResult.as_stderr( - 'TopicModel with id %s is missing a corresponding TopicSummaryModel' % topic_id) - ) -) -``` -
- -**Report Total Topic Models**: This reports the count of all TopicModel entities. -**Report count of missing TopicSummaryModels**: This reports the count of TopicModel entities that do not have corresponding TopicSummaryModel entities. -**Report missing TopicSummaryModels**: This logs detailed information about each missing summary model. - -
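
Each frame in the log above follows the same two-line anatomy. For example, taking one frame from the error log:

```python
File "/workspace/core/domain/suggestion_services.py", line 1408, in _get_plain_text_from_html_content_string
    html_content_string_with_rte_tags_replaced = re.sub(
```

The first line names the file, the line number, and the function that was executing; the second line shows the source line that ran.
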
Here's one approach to do that - -```python -return ( - ( - number_of_topic_models_report, - number_of_missing_topic_summary_models_report, - missing_topic_summary_models_report, - ) - | 'Combine reported results' >> beam.Flatten() -) -``` -
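
To make the bottom-up reading concrete, here is a minimal, self-contained sketch (not Oppia code) that produces the same kind of traceback:

```python
import re

def strip_tags(html_content):
    # re.sub's third argument must be a str or bytes-like object, so
    # passing a list raises "TypeError: expected string or bytes-like
    # object", just like in the error log above.
    return re.sub(r'<[^>]+>', '', html_content)

strip_tags(['<p>Hello</p>', '<p>World</p>'])
```

Running this prints a traceback whose bottom-most frame is in `re.py` and whose frame just above it is our own `strip_tags`, the first place our code touches the point of failure.
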
- -This combines all the reports into a single output to give a comprehensive result of our validation job. - -
Here is the complete code with all the above steps: - -```python -from core.jobs import base_jobs -from core.jobs.io import ndb_io -from core.jobs.transforms import job_result_transforms -from core.jobs.types import job_run_result -from core.platform import models - -import apache_beam as beam - -MYPY = False -if MYPY: # pragma: no cover - from mypy_imports import topic_models - -(topic_models,) = models.Registry.import_models([models.Names.TOPIC]) - -class ValidateTopicModelsJob(base_jobs.JobBase): - """Job that validates TopicModel and TopicSummaryModel consistency.""" - - def run(self) -> beam.PCollection[job_run_result.JobRunResult]: - # Fetch all TopicModels - topic_models_pcoll = ( - self.pipeline - | 'Get all TopicModels' >> ( - ndb_io.GetModels(topic_models.TopicModel.get_all())) - | 'Extract topic model ids' >> beam.Map( - lambda model: model.id) - ) - - # Fetch all TopicSummaryModels - topic_summary_models_pcoll = ( - self.pipeline - | 'Get all TopicSummaryModels' >> ( - ndb_io.GetModels(topic_models.TopicSummaryModel.get_all())) - | 'Extract topic summary model ids' >> beam.Map( - lambda model: model.id) - ) - - # Identify missing TopicSummaryModels - missing_topic_summary_models_pcoll = ( - topic_models_pcoll - | 'Identify missing TopicSummaryModels' >> beam.Filter( - lambda topic_id: topic_id not in topic_summary_models_pcoll) - ) - - # Generate reports - number_of_topic_models_report = ( - topic_models_pcoll - | 'Report count of TopicModels' >> ( - job_result_transforms.CountObjectsToJobRunResult( - 'CountTotalTopicModels')) - ) - - number_of_missing_topic_summary_models_report = ( - missing_topic_summary_models_pcoll - | 'Report count of missing TopicSummaryModels' >> ( - job_result_transforms.CountObjectsToJobRunResult( - 'CountMissingTopicSummaryModels')) - ) - - missing_topic_summary_models_report = ( - missing_topic_summary_models_pcoll - | 'Report missing TopicSummaryModels' >> beam.Map( - lambda topic_id: job_run_result.JobRunResult.as_stderr( - 'TopicModel with id %s is missing a corresponding TopicSummaryModel' % topic_id) - ) - ) - - # Combine reports - return ( - ( - number_of_topic_models_report, - number_of_missing_topic_summary_models_report, - missing_topic_summary_models_report, - ) - | 'Combine reported results' >> beam.Flatten() - ) -``` -
- -To have your job registered and acknowledged by the front-end, make sure to import the module in the corresponding section of `core/jobs/registry.py` - -> Practice 7: Try importing the module in core/jobs/registry.py. Then check out the release coordinator page to verify that the job has been registered. -> Hint: Follow the steps mentioned here in the wiki https://github.com/oppia/oppia/wiki/Apache-Beam-Jobs#local-development-server - -With this, our job is finally completed! - -## Stage 5: Testing the Beam Job - -Let's create unit tests for our `ValidateTopicModelsJob`.The objective of this stage is to ensure that our `ValidateTopicModelsJob` works correctly under different scenarios. We will create unit tests to verify the behavior of our job. Effective testing involves considering various cases, such as: - -- **Null Case**: Ensure the job produces no output when there are no models. -- **Standard Case**: Ensure the job correctly processes typical data. -- **Error Case**: Ensure the job handles unexpected or incorrect data gracefully. - -When drafting test cases for your Beam job, consider the following: - -- **Null Case:** - - **Scenario**: There are no `TopicModels` and `TopicSummaryModels` in the datastore. - - **Expected Outcome**: The job should complete without producing any output, indicating that there are no missing or inconsistent models to report. The output should be an empty list or a specific log message like "No TopicModels found." - -- **Standard Case:** - - **Scenario**: The datastore contains a typical set of `TopicModels` and corresponding `TopicSummaryModels` with correct relationships between them. - - **Expected Outcome**: The job should successfully process the models and report that all models are consistent. For example, "All TopicModels have corresponding TopicSummaryModels," or a success result with counts showing no missing models. - -- **Error Case - Missing Summary Models:** - - **Scenario**: The datastore contains `TopicModels` without corresponding `TopicSummaryModels` or contains corrupted/incomplete data. - - **Expected Outcome**: The job should correctly identify and report the IDs of the `TopicModels` that are missing `TopicSummaryModels`. Additionally, the job should handle any corrupted data gracefully by logging a warning or error message without crashing. The output could include messages like "TopicModel with ID X is missing a corresponding TopicSummaryModel." - -> [!IMPORTANT] -> **Practice 8:** Try to draft the test cases for our Beam jobs. Think of what are the cases that you would like to cover for your Beam job. -> -> **Hint:** Take a look at test files of other jobs to understand how it’s being done. - -Here are the specific cases we will cover: - -1. **Empty Storage:** Ensure the job produces no output when there are no models. -2. **Topic Model without Summary Model:** Ensure the job correctly identifies a topic model without a corresponding summary model. -3. **Topic Model with Summary Model:** Ensure the job does not report an issue when the topic model has a corresponding summary model. -4. **Multiple Topic Models with and without Summary Models:** Test a mix of topic models with and without corresponding summary models. - -To start, we need to create a test class that inherits from `job_test_utils.JobTestBase` and defines our job class. We will also define some constants for the topic IDs we will use in our tests. 
- -> **General Advice:** When writing unit tests for your Beam job, a common approach involves creating mock data that mimics real-world scenarios. This allows you to simulate how the job would process data in a live environment. To make your tests effective, consider the following workflow: -> -> 1. **Set Up Mock Data:** Create mock versions of the models that your Beam job will process, such as `TopicModel` and `TopicSummaryModel`. These mock objects should reflect realistic data that might be encountered during production. -> 2. **Simulate User or Functionality Flows:** Structure your tests to reflect how the job processes data, including different scenarios like valid data, missing data, or inconsistent data between models. -> 3. **Use Setup Methods:** These methods prepare the datastore and other necessary components before running your tests. This involves creating sample models and ensuring they are correctly loaded into the testing environment. - -Below is an example of how you can apply this advice to create the necessary setup and test cases for our Beam job. This example sets up a test environment by creating a single `TopicModel` along with its associated `TopicRightsModel`. - -
-Example Test Setup +TypeError: expected string or bytes-like object -```python -from __future__ import annotations - -from core import feconf -from core.jobs import job_test_utils -from core.jobs.batch_jobs import topic_validation_jobs -from core.jobs.types import job_run_result -from core.platform import models -from core.domain import topic_domain - -import datetime -from typing import Final, Type - -MYPY = False -if MYPY: - from mypy_imports import topic_models - -(topic_models,) = models.Registry.import_models([models.Names.TOPIC]) - -class ValidateTopicModelsJobTests(job_test_utils.JobTestBase): - - JOB_CLASS: Type[topic_validation_jobs.ValidateTopicModelsJob] = topic_validation_jobs.ValidateTopicModelsJob - - TOPIC_ID: Final = 'topic_id' - story_id_1: Final = 'story_id_1' - story_id_2: Final = 'story_id_2' - story_id_3: Final = 'story_id_3' - - def setUp(self) -> None: - # Set up common attributes for TopicModel and TopicRightsModel. - super().setUp() - self.CANONICAL_NAME = 'canonical_name' - self.LANGUAGE_CODE = 'en' - self.NAME = 'name' - self.NEXT_SUBTOPIC_ID = 1 - self.PAGE_TITLE_FRAGMENT_FOR_WEB = 'page_title_fragment_for_web' - self.STORY_REFERENCE_SCHEMA_VERSION = 1 - self.SUBTOPIC_SCHEMA_VERSION = 1 - self.URL_FRAGMENT = 'url_fragment' - - def create_topic_model(self, id, **kwargs): - """Creates a mock TopicModel with realistic data for testing.""" - # Default values for the mock TopicModel. - default_kwargs = { - 'canonical_name': self.CANONICAL_NAME, - 'language_code': self.LANGUAGE_CODE, - 'name': self.NAME, - 'next_subtopic_id': self.NEXT_SUBTOPIC_ID, - 'page_title_fragment_for_web': self.PAGE_TITLE_FRAGMENT_FOR_WEB, - 'story_reference_schema_version': self.STORY_REFERENCE_SCHEMA_VERSION, - 'subtopic_schema_version': self.SUBTOPIC_SCHEMA_VERSION, - 'url_fragment': self.URL_FRAGMENT, - } - default_kwargs.update(kwargs) - # Create the mock TopicModel. - topic_model = self.create_model(topic_models.TopicModel, id=id, **default_kwargs) - topic_model.update_timestamps() - - # Create and commit the associated TopicRightsModel. - first_topic_rights_model = self.create_model( - topic_models.TopicRightsModel, - id=id, - topic_is_published=False - ) - first_topic_rights_model.commit( - feconf.SYSTEM_COMMITTER_ID, - 'Create topic rights', - [{'cmd': topic_domain.CMD_CREATE_NEW}] - ) - - # Commit the TopicModel. 
- topic_model.commit( - feconf.SYSTEM_COMMITTER_ID, - 'Create topic rights', - [{'cmd': topic_domain.CMD_CREATE_NEW}] - ) +Exception raised: expected string or bytes-like object +Traceback (most recent call last): + File "/layers/google.python.pip/pip/lib/python3.8/site-packages/webapp2.py", line 604, in dispatch + return method(*args, **kwargs) + File "/workspace/core/controllers/acl_decorators.py", line 4788, in test_can_fetch_all_contributor_dashboard_stats + return handler(self, username, **kwargs) + File "/workspace/core/controllers/contributor_dashboard.py", line 1062, in get + response = suggestion_services.generate_contributor_certificate_data( + File "/workspace/core/domain/suggestion_services.py", line 3870, in generate_contributor_certificate_data + data = _generate_translation_contributor_certificate_data( + File "/workspace/core/domain/suggestion_services.py", line 3945, in _generate_translation_contributor_certificate_data + plain_text = _get_plain_text_from_html_content_string( + File "/workspace/core/domain/suggestion_services.py", line 1408, in _get_plain_text_from_html_content_string + html_content_string_with_rte_tags_replaced = re.sub( + File "/layers/google.python.runtime/python/lib/python3.8/re.py", line 210, in sub + return _compile(pattern, flags).sub(repl, string, count) +TypeError: expected string or bytes-like object ``` -
-Here are tests for the first two cases. Take a look at them and make sure that you understand how they work. +#### Understanding the Stack Trace: -
-Test Case: Empty Storage +A **stack trace** is a report of the active stack frames at a certain point in time during the execution of a program. It shows the call sequence leading to the point where the error occurred. Each line provides the file name, line number, and function where the error occurred. -```python -def test_empty_storage(self) -> None: - # Ensure the job produces no output when there are no models - self.assert_job_output_is_empty() -``` -
- -
-Test Case: Topic Model without Summary Model +- The stack trace is read **from the bottom up** to understand the sequence of function calls leading to the error. +- The bottom-most lines generally point to standard library or third-party code (in this case, `re.py` from Python's standard library). +- Moving upward, you start seeing calls within the project codebase, which is where we should focus. +- The key is to find the first instance where **Oppia's code interacts with the point of failure**. -```python -def test_topic_model_without_summary_model(self) -> None: - # Create a topic model without a corresponding summary model - self.create_topic_model(self.TOPIC_ID) - - # Run the job and assert expected output - self.assert_job_output_is([ - job_run_result.JobRunResult( - stdout='', - stderr='Invalid TopicModel with id: %s' % self.TOPIC_ID - ), - job_run_result.JobRunResult( - stdout='CountInvalidTopicModels SUCCESS: 1', - stderr='' - ) - ]) -``` -
+>[!IMPORTANT] +> Practice 1: Locate the specific line in the stack trace where Oppia's code is directly involved in causing the failure. This is the point where the error is most likely to originate within the Oppia codebase. +> +> **Hint**: Look for the first occurrence (from the bottom of the stack trace upward) where the stack trace shows a line of code from the Oppia codebase. This line represents the entry point in the Oppia code that led to the failure. Identifying this line will help you trace the error back to its origin in the code. -> [!IMPORTANT] -> Practice 9: Based on the above test cases, try to write the remaining cases on your own: -> -> Topic Model with Summary Model: Ensure that the job does not report an issue when the topic model has a corresponding summary model. -> Multiple Topic Models with and without Summary Models: Test a mix of topic models that have and do not have corresponding summary models. -
-Test Case: Topic Model with Summary Model +In this stack trace, the relevant line is: ```python -def test_topic_model_with_summary_model(self) -> None: - # Create a TopicSummaryModel for the existing TopicModel - topic_summary_model = self.create_model( - topic_models.TopicSummaryModel, - id=self.TOPIC_ID, - canonical_name=self.CANONICAL_NAME, - name=self.NAME, - language_code=self.LANGUAGE_CODE, - url_fragment=self.URL_FRAGMENT, - description='dummy description', - canonical_story_count=0, - additional_story_count=0, - total_skill_count=0, - total_published_node_count=0, - uncategorized_skill_count=0, - subtopic_count=0, - version=1, - published_story_exploration_mapping={ - self.story_id_1: [], - self.story_id_2: [], - self.story_id_3: [] - }, - topic_model_last_updated=datetime.datetime.utcnow(), - topic_model_created_on=datetime.datetime.utcnow() - ) - - # Update timestamps and store the model - topic_summary_model.update_timestamps() - self.put_multi([topic_summary_model]) - self.create_topic_model(self.TOPIC_ID) - - # Run the job and assert expected output - self.assert_job_output_is([ - job_run_result.JobRunResult( - stdout='CountValidTopicModels SUCCESS: 1' - ) - ]) +File "/workspace/core/domain/suggestion_services.py", line 1408, in _get_plain_text_from_html_content_string html_content_string_with_rte_tags_replaced = re.sub( ``` -
-
-Test Case: Multiple Topic Models with and without Summary Models +This line indicates that the error originated in the `_get_plain_text_from_html_content_string` function within the `suggestion_services.py` file. -```python -def test_multiple_topic_models_with_and_without_summary_models(self) -> None: - """Test a mix of topic models with and without corresponding summary models.""" - # Create a topic model without a summary model - self.create_topic_model(self.TOPIC_ID + '_1') - - # Create a topic model with a summary model - self.create_topic_model(self.TOPIC_ID + '_2') - topic_summary_model_2 = self.create_model( - topic_models.TopicSummaryModel, - id=self.TOPIC_ID + '_2', - canonical_name=self.CANONICAL_NAME, - name=self.NAME, - language_code=self.LANGUAGE_CODE, - url_fragment=self.URL_FRAGMENT, - description='dummy description', - canonical_story_count=0, - additional_story_count=0, - total_skill_count=0, - total_published_node_count=0, - uncategorized_skill_count=0, - subtopic_count=0, - version=1, - published_story_exploration_mapping={ - self.story_id_1: [], - self.story_id_2: [], - self.story_id_3: [] - }, - topic_model_last_updated=datetime.datetime.utcnow(), - topic_model_created_on=datetime.datetime.utcnow() - ) - topic_summary_model_2.update_timestamps() - - # Create another topic model with a summary model - self.create_topic_model(self.TOPIC_ID + '_3') - topic_summary_model_3 = self.create_model( - topic_models.TopicSummaryModel, - id=self.TOPIC_ID + '_3', - canonical_name=self.CANONICAL_NAME, - name=self.NAME, - language_code=self.LANGUAGE_CODE, - url_fragment=self.URL_FRAGMENT, - description='dummy description', - canonical_story_count=0, - additional_story_count=0, - total_skill_count=0, - total_published_node_count=0, - uncategorized_skill_count=0, - subtopic_count=0, - version=1, - published_story_exploration_mapping={ - self.story_id_1: [], - self.story_id_2: [], - self.story_id_3: [] - }, - topic_model_last_updated=datetime.datetime.utcnow(), - topic_model_created_on=datetime.datetime.utcnow() - ) - topic_summary_model_3.update_timestamps() - self.put_multi([topic_summary_model_2, topic_summary_model_3]) - - # Run the job and assert expected output - self.assert_job_output_is([ - job_run_result.JobRunResult( - stderr='Invalid TopicModel with id: %s' % (self.TOPIC_ID + '_1') - ), - job_run_result.JobRunResult( - stdout='CountInvalidTopicModels SUCCESS: 1' - ), - job_run_result.JobRunResult( - stdout='CountValidTopicModels SUCCESS: 2' - ) - ]) -``` +You have identified the error location in the codebase and gathered initial clues about what might have gone wrong. -
+Next Step: Pinpoint the specific part of the code causing the error. -
-Here is the complete code for the unit tests: +>[!IMPORTANT] +> Practice 2: Locate the `_get_plain_text_from_html_content_string` function in the `suggestion_services.py` file. +> +> **Hint**: To locate the function, you can make use of search functionalities of your code editor. [Tips for Common IDEs](https://github.com/oppia/oppia/wiki/Tips-for-common-IDEs) ```python -"""Unit tests for jobs.batch_jobs.topic_validation_jobs.""" - -from __future__ import annotations - -from core import feconf -from core.jobs import job_test_utils -from core.jobs.batch_jobs import topic_validation_jobs -from core.jobs.types import job_run_result -from core.platform import models -from core.domain import topic_domain - -import datetime - -from typing import Final, Type - -MYPY = False -if MYPY: - from mypy_imports import topic_models - -(topic_models,) = models.Registry.import_models([models.Names.TOPIC]) - - -class ValidateTopicModelsJobTests(job_test_utils.JobTestBase): - - JOB_CLASS: Type[topic_validation_jobs.ValidateTopicModelsJob] = topic_validation_jobs.ValidateTopicModelsJob - - TOPIC_ID: Final = 'topic_id' - story_id_1: Final = 'story_id_1' - story_id_2: Final = 'story_id_2' - story_id_3: Final = 'story_id_3' - - def setUp(self) -> None: - super().setUp() - self.CANONICAL_NAME = 'canonical_name' - self.LANGUAGE_CODE = 'en' - self.NAME = 'name' - self.NEXT_SUBTOPIC_ID = 1 - self.PAGE_TITLE_FRAGMENT_FOR_WEB = 'page_title_fragment_for_web' - self.STORY_REFERENCE_SCHEMA_VERSION = 1 - self.SUBTOPIC_SCHEMA_VERSION = 1 - self.URL_FRAGMENT = 'url_fragment' - - def create_topic_model(self, id, **kwargs): - default_kwargs = { - 'canonical_name': self.CANONICAL_NAME, - 'language_code': self.LANGUAGE_CODE, - 'name': self.NAME, - 'next_subtopic_id': self.NEXT_SUBTOPIC_ID, - 'page_title_fragment_for_web': self.PAGE_TITLE_FRAGMENT_FOR_WEB, - 'story_reference_schema_version': self.STORY_REFERENCE_SCHEMA_VERSION, - 'subtopic_schema_version': self.SUBTOPIC_SCHEMA_VERSION, - 'url_fragment': self.URL_FRAGMENT, - } - default_kwargs.update(kwargs) - topic_model = self.create_model(topic_models.TopicModel, id=id, **default_kwargs) - topic_model.update_timestamps() - - first_topic_rights_model = self.create_model( - topic_models.TopicRightsModel, - id=id, - topic_is_published=False - ) - first_topic_rights_model.commit( - feconf.SYSTEM_COMMITTER_ID, - 'Create topic rights', - [{'cmd': topic_domain.CMD_CREATE_NEW}] - ) - - topic_model.commit( - feconf.SYSTEM_COMMITTER_ID, - 'Create topic rights', - [{'cmd': topic_domain.CMD_CREATE_NEW}] - ) - - def test_empty_storage(self) -> None: - self.assert_job_output_is_empty() - - def test_topic_model_without_summary_model(self) -> None: - # Create a topic model without a corresponding summary model - self.create_topic_model(self.TOPIC_ID) - - # Run the job and assert expected output - - self.assert_job_output_is([ - job_run_result.JobRunResult( - stdout='', - stderr='Invalid TopicModel with id: %s' % self.TOPIC_ID - ), - job_run_result.JobRunResult( - stdout='CountInvalidTopicModels SUCCESS: 1', - stderr='' - ) - ]) - - def test_topic_model_with_summary_model(self) -> None: - topic_summary_model = self.create_model( - topic_models.TopicSummaryModel, - id=self.TOPIC_ID, - canonical_name=self.CANONICAL_NAME, - name=self.NAME, - language_code=self.LANGUAGE_CODE, - url_fragment=self.URL_FRAGMENT, - description='dummy description', - canonical_story_count=0, - additional_story_count=0, - total_skill_count=0, - total_published_node_count=0, - uncategorized_skill_count=0, - 
subtopic_count=0, - version=1, - published_story_exploration_mapping={ - self.story_id_1: [], - self.story_id_2: [], - self.story_id_3: [] - }, - topic_model_last_updated=datetime.datetime.utcnow(), - topic_model_created_on=datetime.datetime.utcnow() - ) - - topic_summary_model.update_timestamps() - self.put_multi([topic_summary_model]) - self.create_topic_model(self.TOPIC_ID) - - self.assert_job_output_is([ - job_run_result.JobRunResult( - stdout='CountValidTopicModels SUCCESS: 1' - ) - ]) - - def test_multiple_topic_models_with_and_without_summary_models(self) -> None: - self.create_topic_model(self.TOPIC_ID + '_1') - self.create_topic_model(self.TOPIC_ID + '_2') - topic_summary_model_2 = self.create_model( - topic_models.TopicSummaryModel, - id=self.TOPIC_ID + '_2', - canonical_name=self.CANONICAL_NAME, - name=self.NAME, - language_code=self.LANGUAGE_CODE, - url_fragment=self.URL_FRAGMENT, - description='dummy description', - canonical_story_count=0, - additional_story_count=0, - total_skill_count=0, - total_published_node_count=0, - uncategorized_skill_count=0, - subtopic_count=0, - version=1, - published_story_exploration_mapping={ - self.story_id_1: [], - self.story_id_2: [], - self.story_id_3: [] - }, - topic_model_last_updated=datetime.datetime.utcnow(), - topic_model_created_on=datetime.datetime.utcnow() - ) - topic_summary_model_2.update_timestamps() - self.create_topic_model(self.TOPIC_ID + '_3') - topic_summary_model_3 = self.create_model( - topic_models.TopicSummaryModel, - id=self.TOPIC_ID + '_3', - canonical_name=self.CANONICAL_NAME, - name=self.NAME, - language_code=self.LANGUAGE_CODE, - url_fragment=self.URL_FRAGMENT, - description='dummy description', - canonical_story_count=0, - additional_story_count=0, - total_skill_count=0, - total_published_node_count=0, - uncategorized_skill_count=0, - subtopic_count=0, - version=1, - published_story_exploration_mapping={ - self.story_id_1: [], - self.story_id_2: [], - self.story_id_3: [] - }, - topic_model_last_updated=datetime.datetime.utcnow(), - topic_model_created_on=datetime.datetime.utcnow() - ) - topic_summary_model_3.update_timestamps() - self.put_multi([topic_summary_model_2, topic_summary_model_3]) - - self.assert_job_output_is([ - job_run_result.JobRunResult( - stderr='Invalid TopicModel with id: %s' % (self.TOPIC_ID + '_1') - ), - job_run_result.JobRunResult( - stdout='CountInvalidTopicModels SUCCESS: 1' - ), - job_run_result.JobRunResult( - stdout='CountValidTopicModels SUCCESS: 2' - ) - ]) +File "/workspace/core/domain/suggestion_services.py", line 1408, in _get_plain_text_from_html_content_string html_content_string_with_rte_tags_replaced = re.sub( File "/layers/google.python.runtime/python/lib/python3.8/re.py", line 210, in sub return _compile(pattern, flags).sub(repl, string, count) TypeError: expected string or bytes-like object ``` -
- -## Stage 6: Run and Validate the Job - -Now let’s try running the job on our local server. - -1. Sign in as an administrator ([instructions](https://github.com/oppia/oppia/wiki/How-to-access-Oppia-webpages#log-in-as-a-super-administrator)). -2. Navigate to Admin Page > Roles Tab. -3. Add the "Release Coordinator" role to the username you are signed in with. -4. Navigate to [http://localhost:8181/release-coordinator](http://localhost:8181/release-coordinator), then to the Beam Jobs tab. -5. Search for your job and then click the Play button. -6. Click "Start a new job." - -For the above step, we didn’t create any dummy data; thus, it covers a scenario similar to our first unit test. Let’s create dummy data and see how our job works for that. Generate dummy data by following the steps mentioned in this wiki page: [Populating data on local server](https://github.com/oppia/oppia/wiki/Populating-data-on-local-server#maths-classroom-page). - -> **Practice 10:** Now let's create dummy data to see how our job works with different scenarios. -> -> Follow the steps mentioned in this [wiki page](https://github.com/oppia/oppia/wiki/Populating-data-on-local-server#maths-classroom-page) to generate dummy data on your local server. -> Re-run the job following the same steps as before. - -The beam job doesn’t report any topic ID with missing topic summary models. That is because, by default, all the topic models are accompanied by their corresponding `TopicSummaryModel`. - -> **Practice 11:** To test the job's ability to identify missing `TopicSummaryModel`: -> -> - Set up a local datastore using DSAdmin by following [these instructions](https://github.com/oppia/oppia/wiki/Debugging-datastore-locally). -> - Manually delete one of the `TopicSummaryModel` entries. -> - Re-run the job to check if it correctly identifies the missing `TopicSummaryModel`. - -The beam job will report the topic ID for which you deleted the topic-summary model this time. - -> **Practice 12:** Now that you have a good understanding of Beam jobs at Oppia, let's add one more case to cover. Your job should also report any `TopicSummaryModel` that lacks a corresponding `TopicModel`. -> -> **Hint:** You'll need to adjust the job logic to include a check for `TopicSummaryModel` entries without corresponding `TopicModel` entries. Ensure that the unit tests also cover this scenario. Test the job by manually creating or deleting entries in your local datastore to mimic this scenario. - -## Conclusion - -Congratulations! You've written a Beam job that validates the consistency between `TopicModel` and `TopicSummaryModel`. You have also learned how to test your job with various scenarios to ensure its correctness. - -For further reading and more complex scenarios, refer to the [Apache Beam documentation](https://beam.apache.org/documentation/) and [Oppia's developer guides](https://github.com/oppia/oppia/wiki). - -### Additional Steps for Production Deployment - -> **Note:** If you want to get this job run and deployed in production, there are additional steps to follow. These steps ensure that your job runs smoothly in a live environment and produces the expected results. - -Before a job can be run and deployed in production, it must first be tested on the Oppia backup server. Here are the requirements for a job to be considered "fully tested": - -- **No Failures:** The job should run without failures on the Oppia backup server. -- **Expected Output:** The job produces the expected output. 
-- **Expected Outcome:** The job has the expected outcome, which must be verified by user-facing changes, a validation job, or an output check. -- **Approval:** The job should be explicitly approved by the server jobs admin. - -To understand the full process of testing Beam jobs at Oppia, please refer to the [Oppia Wiki on Testing Jobs](https://github.com/oppia/oppia/wiki/Testing-jobs-and-other-features-on-production). This wiki page includes detailed instructions and a template for submitting your job for production deployment. The template provides a structured way to request testing and approval, and the linked doc shows how to fill it out specifically for this job. - -By following these steps, you'll ensure that your Beam job is ready for production and can be deployed to help maintain the integrity and consistency of data in Oppia. - -## We Value Your Feedback +Based on the error logs, the issue arises because the argument passed to the re.sub() function in the _get_plain_text_from_html_content_string method is not of the expected data type (a string or bytes-like object). -Did you find this tutorial useful? Or, did you encounter any issues or find things hard to grasp? Let us know by opening a discussion on [GitHub Discussions](https://github.com/oppia/oppia/discussions/new?category=tutorial-feedback). We would be happy to help you and make improvements as needed! +You have now pinpointed the exact file, function, and line where the issue originated and are ready to dive deeper into the problem area. From 927d6ea454a3590b79294cbe51ed384582735c6f Mon Sep 17 00:00:00 2001 From: Ashwath Kannan Date: Fri, 4 Oct 2024 01:43:17 +0530 Subject: [PATCH 3/9] STAGE 2 --- ...the-reproduction-steps-for-server-error.md | 159 +++++++++++++++++- 1 file changed, 158 insertions(+), 1 deletion(-) diff --git a/Tutorial:-Learn-How-to-figure-out-the-reproduction-steps-for-server-error.md b/Tutorial:-Learn-How-to-figure-out-the-reproduction-steps-for-server-error.md index ff58434d..2075c1df 100644 --- a/Tutorial:-Learn-How-to-figure-out-the-reproduction-steps-for-server-error.md +++ b/Tutorial:-Learn-How-to-figure-out-the-reproduction-steps-for-server-error.md @@ -81,7 +81,7 @@ When faced with a server error, developers at Oppia typically follow these steps **Document Your Findings:** Once you've identified and confirmed the cause of the error, document your findings in detail on the original GitHub issue. Provide a summary of the error, clear steps to reproduce it, and any relevant observations or logs to support your conclusions. ### Stage 1: Analyze the Error Logs to Locate the Affected Code -**Objective**: Identify where the error occurred in the code. +> **Objective**: Identify where the error occurred in the code. > **Note**: This tutorial focuses on server errors, which are typically reported by server admins and come with error logs. For other issues, such as user-reported bugs, error logs might not always be available. In these cases, it is essential to ask users for "steps to reproduce the bug." If this information is not provided, we can contact the user to request additional details that will help us understand the issue better. @@ -150,3 +150,160 @@ File "/workspace/core/domain/suggestion_services.py", line 1408, in _get_plain_t Based on the error logs, the issue arises because the argument passed to the re.sub() function in the _get_plain_text_from_html_content_string method is not of the expected data type (a string or bytes-like object). 
You have now pinpointed the exact file, function, and line where the issue originated and are ready to dive deeper into the problem area.

### Stage 2: Examine the Affected Code
> **Objective**: Analyze the affected code to understand its intended behavior and determine what might have gone wrong.

Examine the `_get_plain_text_from_html_content_string` function to understand its purpose and the specific operation where the error occurs.
+ Here's the function: + +```python +def _get_plain_text_from_html_content_string(html_content_string: str) -> str: + """Retrieves the plain text from the given html content string. RTE element + occurrences in the html are replaced by their corresponding rte component + name, capitalized in square brackets. + eg:
    <p>Sample1 <oppia-noninteractive-math></oppia-noninteractive-math>
    Sample2</p> will give as output: Sample1 [Math] Sample2.
    Note: similar logic exists in the frontend in format-rte-preview.filter.ts.

    Args:
        html_content_string: str. The content html string to convert to plain
            text.

    Returns:
        str. The plain text string from the given html content string.
    """

    def _replace_rte_tag(rte_tag: Match[str]) -> str:
        """Replaces all of the <oppia-noninteractive> tags with their
        corresponding rte component name in square brackets.

        Args:
            rte_tag: MatchObject. A matched object that contains the
                oppia-noninteractive rte tags.

        Returns:
            str. The string to replace the rte tags with.
        """
        # Retrieve the matched string from the MatchObject.
        rte_tag_string = rte_tag.group(0)
        # Get the name of the rte tag. The hyphen is there as an optional
        # matching character to cover the case where the name of the rte
        # component is more than one word.
        rte_tag_name = re.search(
            r'oppia-noninteractive-(\w|-)+', rte_tag_string)
        # Here, rte_tag_name is always going to exist because the string
        # that was passed into this function is always going to contain an
        # `<oppia-noninteractive>` substring. So, just to rule out the
        # possibility of None for mypy type checking, we used an assertion
        # here.
        assert rte_tag_name is not None
        # Retrieve the matched string from the MatchObject.
        rte_tag_name_string = rte_tag_name.group(0)
        # Get the name of the rte component.
        rte_component_name_string_list = rte_tag_name_string.split('-')[2:]
        # If the component name is more than one word, connect the words
        # with spaces to create a single string.
        rte_component_name_string = ' '.join(rte_component_name_string_list)
        # Capitalize each word in the string.
        capitalized_rte_component_name_string = (
            rte_component_name_string.title())
        formatted_rte_component_name_string = ' [%s] ' % (
            capitalized_rte_component_name_string)
        return formatted_rte_component_name_string

    # Replace all of the <oppia-noninteractive> tags with their rte
    # component names capitalized in square brackets.
    html_content_string_with_rte_tags_replaced = re.sub(
        r'<oppia-noninteractive-[^>]+>(.*?)</oppia-noninteractive-[^>]+>',
        _replace_rte_tag, html_content_string)
    # Get rid of all of the other html tags.
    plain_text = html_cleaner.strip_html_tags(
        html_content_string_with_rte_tags_replaced)
    # Remove trailing and leading whitespace and ensure that all words are
    # separated by a single space.
    plain_text_without_contiguous_whitespace = ' '.join(plain_text.split())
    return plain_text_without_contiguous_whitespace
```
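
To build intuition for the happy path, here is a quick usage sketch (assuming the function is imported from `core/domain/suggestion_services.py`):

```python
# A hypothetical, standalone call with a well-formed HTML string:
plain_text = _get_plain_text_from_html_content_string(
    '<p>Sample1 <oppia-noninteractive-math></oppia-noninteractive-math>'
    ' Sample2</p>')
print(plain_text)  # -> Sample1 [Math] Sample2
```
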

> [!IMPORTANT]
> Practice 3: Locate the lines that are causing the error in the function shown above.
>
> **Hint**: Review the error logs. In a real debugging scenario, the stack trace would help you pinpoint this specific line as the source of the error.

In this stack trace, the relevant line is:

```python
File "/workspace/core/domain/suggestion_services.py", line 1408, in _get_plain_text_from_html_content_string
    html_content_string_with_rte_tags_replaced = re.sub(
```

This line indicates that the error originated in the `_get_plain_text_from_html_content_string` function within the `suggestion_services.py` file at line number 1408. Let's take a look at that line:

```python
html_content_string_with_rte_tags_replaced = re.sub(
    r'<oppia-noninteractive-[^>]+>(.*?)</oppia-noninteractive-[^>]+>',
    _replace_rte_tag, html_content_string)
```

Note that the third parameter, `html_content_string`, is being passed to the `re.sub` function, and that this parameter is the argument of the function `_get_plain_text_from_html_content_string`.

The `re.sub` function is used to replace certain HTML tags. According to the Python documentation for `re.sub`, it expects the first argument to be a pattern, the second argument to be a replacement string (or a function that computes one), and the third argument to be the string to be searched.

Even though the `re.sub` [documentation](https://docs.python.org/3/library/re.html#re.sub) doesn't explicitly mention the TypeError, we can infer the following:
- **TypeError Convention**: In Python, `TypeError` is conventionally raised to indicate that an operation or function received an argument of an inappropriate type. This means that if `re.sub` expects a string or bytes-like object but receives a different type, it raises a `TypeError`.
- **Error Message**: The error message "expected string or bytes-like object" indicates that `re.sub` was given an argument that wasn't a string or bytes-like object, which is why it raised a `TypeError`.

> [!IMPORTANT]
> **Practice 4**: Determine where the `html_content_string` is coming from.
>
> **Hint**: Review the error logs carefully. Pay close attention to the lines mentioned in the stack trace, especially the one that is calling the `_get_plain_text_from_html_content_string` function. This will give you a clue about where the `html_content_string` originates.

According to the error log, this function is called in `suggestion_services.py` within the `_generate_translation_contributor_certificate_data` function.

```python
# Retrieve the html content that is emphasized on the
# Contributor Dashboard pages. This content is what stands
# out for each suggestion when a user views a list of
# suggestions.
get_html_representing_suggestion = (
    SUGGESTION_EMPHASIZED_TEXT_GETTER_FUNCTIONS[
        suggestion.suggestion_type]
)
plain_text = _get_plain_text_from_html_content_string(
    get_html_representing_suggestion(suggestion))
```

As we can see, we are passing the output of the function `get_html_representing_suggestion(suggestion)` as the input to the function `_get_plain_text_from_html_content_string`.

Let's take a closer look at the method `get_html_representing_suggestion(suggestion)`. This method handles different types of suggestions and retrieves the corresponding HTML content based on the type of suggestion.
+ +In the context of this method, two lambda functions are defined to extract HTML content based on the `suggestion_type`: + +**For translation suggestions:** +```python +SUGGESTION_TRANSLATE_CONTENT_HTML: Callable[ + [suggestion_registry.SuggestionTranslateContent], str +] = lambda suggestion: suggestion.change_cmd.translation_html +``` +This lambda function is used when the suggestion type is related to translating content. It retrieves the translation_html property from the change_cmd attribute of the suggestion object. + +**For question suggestions:** +```python +SUGGESTION_ADD_QUESTION_HTML: Callable[ + [suggestion_registry.SuggestionAddQuestion], str +] = lambda suggestion: suggestion.change_cmd.question_dict[ + 'question_state_data']['content']['html'] +``` + +This lambda function is used when the suggestion type involves adding a question. It retrieves the `html` property from the `question_state_data` dictionary inside the `question_dict` of the suggestion object. + +> [!IMPORTANT] +> Practice 5: Based on the code provided and the context of the lambda functions, what could potentially go wrong with the data being accessed? Specifically, think about the structure of the suggestion.change_cmd.question_dict['question_state_data']['content']['html'] and suggestion.change_cmd.translation_html fields. What potential data type or structure issues could arise here? + +One possible issue is that the `html` property of the `question_dict` or the `translation_html` field might not always be a string as expected. It could be `None` or another data type entirely, which would cause issues when the `_get_plain_text_from_html_content_string` function tries to process it. If the function is expecting a string but receives a different type (such as `None` or a more complex object), it could result in a `TypeError`. This could explain why the error indicates that a string or bytes-like object was expected but something else was provided. + +In this case, the issue is likely that either `suggestion.change_cmd.translation_html` or `suggestion.change_cmd.question_dict['question_state_data']['content']['html']` is not always a string, leading to the observed error when passed to the `re.sub()` function. + +> You now understand the purpose and expected behavior of the affected code and have identified areas where discrepancies may have occurred. \ No newline at end of file From ac73c1b1ca03ab475eb0b1d997baa26115d1bab2 Mon Sep 17 00:00:00 2001 From: Ashwath Kannan Date: Fri, 4 Oct 2024 02:03:00 +0530 Subject: [PATCH 4/9] Stage 3 --- ...the-reproduction-steps-for-server-error.md | 198 +++++++++++++++++- 1 file changed, 196 insertions(+), 2 deletions(-) diff --git a/Tutorial:-Learn-How-to-figure-out-the-reproduction-steps-for-server-error.md b/Tutorial:-Learn-How-to-figure-out-the-reproduction-steps-for-server-error.md index 2075c1df..0ce1e3fa 100644 --- a/Tutorial:-Learn-How-to-figure-out-the-reproduction-steps-for-server-error.md +++ b/Tutorial:-Learn-How-to-figure-out-the-reproduction-steps-for-server-error.md @@ -256,7 +256,7 @@ Even though the re.sub [documentation](https://docs.python.org/3/library/re.html - **Error Message**: The error message "expected string or bytes-like object" indicates that re.sub was given an argument that wasn't a string or bytes-like object, which is why it raised a `TypeError`. > [!IMPORTANT] -> **Practice 4**: Determine where the `html_content_string` is coming from. +> Practice 4: Determine where the `html_content_string` is coming from. 
> > **Hint**: Review the error logs carefully. Pay close attention to the lines mentioned in the stack trace, especially the one that is calling the `_get_plain_text_from_html_content_string` function. This will give you a clue about where the `html_content_string` originates. @@ -306,4 +306,198 @@ One possible issue is that the `html` property of the `question_dict` or the `tr In this case, the issue is likely that either `suggestion.change_cmd.translation_html` or `suggestion.change_cmd.question_dict['question_state_data']['content']['html']` is not always a string, leading to the observed error when passed to the `re.sub()` function. -> You now understand the purpose and expected behavior of the affected code and have identified areas where discrepancies may have occurred. \ No newline at end of file +> You now understand the purpose and expected behavior of the affected code and have identified areas where discrepancies may have occurred. + +### Stage 3: Investigate Potential Causes by Exploring the Code in Depth +> **Objective**: Formulate hypotheses about why the error is occurring based on initial findings, and verify them by examining the code more closely. + +At this point, brainstorm the possible data types that could be causing this issue. Consider the following questions: +- What Data Types Are Expected? + - What data types should suggestion.change_cmd.question_dict['question_state_data']['content']['html'] and suggestion.change_cmd.translation_html be? Are they always strings, or can they also be other types? + +- Can Non-String Values Cause Issues? + - Could these fields sometimes store non-string values, such as lists or None? If so, this could lead to issues when the _get_plain_text_from_html_content_string function attempts to process them. + +> You now have potential causes of the error. To confirm these, you need to dig deeper into the codebase. + +#### Investigating the Source of the Problem + +Our initial suspicion is that the problem might be due to the data types involved. To investigate further, we need to: + +- Check the Creation of Suggestions: Examine the code responsible for creating suggestions and questions to ensure that the expected data types are being used. This includes verifying that fields are populated correctly and checking for any anomalies that might lead to errors. + +> [!IMPORTANT] +> Practice 6: Locate the Code Responsible for Creating Suggestions +> +> **Hint**: Analyze the network requests triggered when creating a suggestion, and then examine the handler attached to the endpoint. Use the resources such as Find the Right Code to Change and Analyzing the Codebase to guide your investigation. + +Open the Developer Tools in your browser and go to the "Network" tab while creating a translation suggestion. You'll notice that the endpoint `/suggestionhandler` is triggered. +Next, check the `main.py`file in the Oppia codebase to find which handler is associated with this endpoint. You'll see that the endpoint is connected to: + +```python +get_redirect_route( + r'%s/' % feconf.SUGGESTION_URL_PREFIX, + suggestion.SuggestionHandler), +``` + +> [!IMPORTANT] +> Practice 7: Analyze the `SuggestionHandler` in `oppia/core/controllers/suggestion.py`. +> Can you find the code that is responsible for creating suggestions? 
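
If you're following along locally, a quick way to trace an endpoint to its handler is to search the codebase from the command line. For example (paths per the standard Oppia checkout):

```bash
# Find where the suggestion endpoint prefix is defined and routed.
grep -n "SUGGESTION_URL_PREFIX" core/feconf.py main.py

# Then locate the handler class itself.
grep -rn "class SuggestionHandler" core/controllers/
```
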
Within `oppia/core/controllers/suggestion.py`, you'll see that suggestion creation is managed by this method:

```python
suggestion = suggestion_services.create_suggestion(
    suggestion_type,
    self.normalized_payload['target_type'],
    self.normalized_payload['target_id'],
    self.normalized_payload['target_version_at_submission'],
    self.user_id,
    self.normalized_payload['change_cmd'],
    self.normalized_payload['description']
)
```

This shows that the actual creation of the suggestion is handled by the `create_suggestion` method in the `suggestion_services.py` file.

To investigate further, we need to review the [Suggestion Creation Services](https://github.com/oppia/oppia/blob/6c2003519db2981309b6d057af3edc7372da005e/core/domain/suggestion_services.py#L150) method to understand how fields like `content_html` are populated.

> [!IMPORTANT]
> Practice 8: Analyze the `create_suggestion` method of the `suggestion_services.py` file. Can you find the code where the `content_html` field is being populated or used?

On reviewing the method, we can see at this [line](https://github.com/oppia/oppia/blob/6c2003519db2981309b6d057af3edc7372da005e/core/domain/suggestion_services.py#L215) that, for suggestions of type 'Translations', we fetch `content_html` using the `get_content_html` method and perform an equality check against the `change_cmd` dict's `content_html` property. This might be what we are looking for. Let's take a look at this method:

```python
def get_content_html(
    self, state_name: str, content_id: str
) -> Union[str, List[str]]:
    """Return the content for a given content id of a state.

    Args:
        state_name: str. The name of the state.
        content_id: str. The id of the content.

    Returns:
        str. The html content corresponding to the given content id of a state.

    Raises:
        ValueError. The given state_name does not exist.
    """
    if state_name not in self.states:
        raise ValueError('State %s does not exist' % state_name)

    return self.states[state_name].get_content_html(content_id)
```

Here, we notice a discrepancy between the type annotation (`-> Union[str, List[str]]`) and the docstring. The docstring states that the method returns a `str`, but the type annotation indicates that it could return either a `str` or a `List[str]`. This is a good example of why you shouldn't rely solely on docstrings to understand a function's behavior. Docstrings can become outdated or inaccurate over time, especially in large codebases like Oppia, where many contributors make changes.

#### Verify the Return Type by Analyzing the Code

To accurately determine the return type, you need to look at the actual implementation of the `get_content_html` method. Here, the method calls `get_content_html(content_id)` on `self.states[state_name]`. To understand what is being returned, you need to trace this call further into the `self.states[state_name]` object and its `get_content_html` method.

If we look further into this code:
```python
return self.states[state_name].get_content_html(content_id)
```

This function eventually returns the value:
```python
return content_id_to_translatable_content[content_id].content_value
```

Here, the `content_value` property is populated via:
```python
self.content_value = content_value
```

The type of `content_value` is defined as:
```python
content_value: feconf.ContentValueType
```

Upon examining `ContentValueType`, we find:
```python
# The data type for the translated or translatable content in any
# BaseTranslatableObject.
ContentValueType = Union[str, List[str]]
```

By investigating the return types of each function and following the flow of data through the helper methods, we confirm that the `get_content_html` method can indeed return either a `str` or a `List[str]`. However, from examining the backend functions alone, it's not entirely clear whether the `content_html` variable actually ends up being a `List[str]` or just a `str`.

The evidence we have gathered so far is suggestive, but not definitive. To gather more solid proof, we need to trace where the `content_html` property in the `change_cmd` dictionary is actually populated.

Looking at the `SuggestionHandler`:
```python
suggestion = suggestion_services.create_suggestion(
    suggestion_type,
    self.normalized_payload['target_type'],
    self.normalized_payload['target_id'],
    self.normalized_payload['target_version_at_submission'],
    self.user_id,
    self.normalized_payload['change_cmd'],
    self.normalized_payload['description']
)
```

This snippet indicates that `content_html` is part of the `change_cmd` dictionary, which is passed from the frontend to the backend. This means the backend isn't strictly defining what the type of `content_html` should be; rather, it's receiving it from the frontend.

To gain more clarity, we need to examine the network calls in the frontend that send data to the `SuggestionHandler`. This can help us trace the source of the `content_html` value.

In the frontend, we find the function `suggestTranslatedTextAsync`, which is responsible for making a call to the handler:
```typescript
async suggestTranslatedTextAsync(
    expId: string,
    expVersion: string,
    contentId: string,
    stateName: string,
    languageCode: string,
    contentHtml: string | string[],
    translationHtml: string | string[],
    imagesData: ImagesData[],
    dataFormat: string
): Promise<Object> {
  const postData: Data = {
    suggestion_type: 'translate_content',
    target_type: 'exploration',
    description: 'Adds translation',
    target_id: expId,
    target_version_at_submission: expVersion,
    change_cmd: {
      cmd: 'add_written_translation',
      content_id: contentId,
      state_name: stateName,
      language_code: languageCode,
      content_html: contentHtml,
      translation_html: translationHtml,
      data_format: dataFormat,
    },
    files: await this.imageLocalStorageService.getFilenameToBase64MappingAsync(imagesData),
  };
  const body = new FormData();
  body.append('payload', JSON.stringify(postData));
  return this.http.post('/suggestionhandler/', body).toPromise();
}
```

Here, we can see that `content_html` is typed as `string | string[]` (a single string or an array of strings). This confirms that the frontend can indeed send an array of strings as the value for `content_html`. Thus, the `content_html` property for translation suggestions could indeed be either a single string or a list of strings.

We have identified the root cause of the error. The `_get_plain_text_from_html_content_string` function expects a `str`. When it receives a list of strings, it results in a `TypeError`.

**Note on Why mypy Did Not Catch This Error**

`mypy` is a static type checker for Python, designed to enforce type safety by verifying that types declared in type annotations are respected throughout the code. However, `mypy` has limitations when dynamic typing is involved, especially with the use of functions like `setattr` that dynamically set attributes at runtime.
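To see this limitation in isolation, consider the following minimal, self-contained sketch. It is not Oppia code; the class and field names are made up for illustration, but it passes a `mypy` check while raising the very same error at runtime:

```python
import re
from typing import Dict, List, Union

AcceptableTypes = Union[str, List[str]]


class FakeChange:
    # The attribute is *declared* as str, so mypy trusts this annotation.
    translation_html: str


change_dict: Dict[str, AcceptableTypes] = {
    'translation_html': ['<p>Hello</p>'],  # actually a list of strings
}

change = FakeChange()
# setattr is typed as (object, str, Any) -> None, so mypy checks nothing here.
setattr(change, 'translation_html', change_dict.get('translation_html'))

# mypy believes this argument is a str; at runtime it is a List[str], so
# this raises: TypeError: expected string or bytes-like object.
re.sub(r'<[^>]+>', '', change.translation_html)
```

Running `mypy` over this snippet reports no errors, yet executing it reproduces the exact `TypeError` from the production logs.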
In the `create_suggestion` method, the `change_cmd` parameter is annotated with `Mapping[str, change_domain.AcceptableChangeDictTypes]`. The `AcceptableChangeDictTypes` union type includes a variety of types such as `str`, `bool`, `float`, `int`, `None`, `List[str]`, and several domain-specific dictionary types. This annotation indicates that `change_cmd` can contain any of these types.

Within the `BaseChange` class, the `setattr` function is used to dynamically set attributes based on the contents of the `change_dict`:
```python
for attribute_name in cmd_attribute_names:
    setattr(self, attribute_name, change_dict.get(attribute_name))
```

Normally, if a variable were declared with the type `AcceptableChangeDictTypes` and later passed to a function expecting a `str`, `mypy` would flag this as a potential error, because `AcceptableChangeDictTypes` also includes types like `List[str]` that are not compatible with `str`.

However, in this case, `setattr` is used to assign a value from `change_dict` (which could be of any type included in `AcceptableChangeDictTypes`) to an attribute that is declared to be of type `str`. Since `setattr` is a built-in function that dynamically assigns values to object attributes, `mypy` cannot enforce type safety at this point. It assumes that the attribute, which is declared as `str`, will always be of type `str`, even though it might actually hold a `List[str]` or another type from `AcceptableChangeDictTypes`.

This is why `mypy` does not catch the type mismatch when a function like `_get_plain_text_from_html_content_string`, which expects a `str`, encounters an attribute that is, in fact, a `List[str]`. The use of `setattr` bypasses `mypy`'s static type checking, leading to potential runtime type errors that `mypy` does not detect.

> You have pinpointed the exact cause of the error in the code, understanding why the issue occurs under certain conditions.

From f9d21b17c15d530bf430561ad2dd871d65f63783 Mon Sep 17 00:00:00 2001
From: Ashwath Kannan
Date: Fri, 4 Oct 2024 02:14:35 +0530
Subject: [PATCH 5/9] Stage 4

---
 ...the-reproduction-steps-for-server-error.md | 139 ++++++++++++++++++
 1 file changed, 139 insertions(+)

diff --git a/Tutorial:-Learn-How-to-figure-out-the-reproduction-steps-for-server-error.md b/Tutorial:-Learn-How-to-figure-out-the-reproduction-steps-for-server-error.md
index 0ce1e3fa..ae062e39 100644
--- a/Tutorial:-Learn-How-to-figure-out-the-reproduction-steps-for-server-error.md
+++ b/Tutorial:-Learn-How-to-figure-out-the-reproduction-steps-for-server-error.md
@@ -501,3 +501,142 @@ However, in this case, `setattr` is used to assign a value from `change_dict` wh

This is why `mypy` does not catch the type mismatch when a function like `_get_plain_text_from_html_content_string`, which expects a `str`, encounters an attribute that is, in fact, a `List[str]`. The use of `setattr` bypasses `mypy`'s static type checking, leading to potential runtime type errors that `mypy` does not detect.

> You have pinpointed the exact cause of the error in the code, understanding why the issue occurs under certain conditions.

### Stage 4: Reproduce the Error
> **Objective**: Confirm the suspected cause of the error by replicating it in a controlled environment.

Once you have a hypothesis about the root cause of the issue, it's time to verify it.
There are several ways you can do this:

##### Option 1: Reproduce on a Local Server

Try to replicate the error on a local server by following the user journey that leads to the issue. This involves setting up a local environment, creating or modifying a translation suggestion with the problematic inputs, and attempting to generate a certificate to see if the error occurs again. This method is practical and quick if you can accurately simulate user actions, but it may not always capture the exact conditions of the live environment.

**Pros**: Practical and quick if you can closely simulate user actions.

**Cons**: May not always replicate the exact conditions of the live environment.

##### Option 2: Use Unit Tests

Modify or create unit tests to check if they trigger the same error when using the problematic inputs in question. This approach involves locating existing unit tests, such as those for the `generate_contributor_certificate_data` function, and adjusting them to use the problematic inputs. This method can be efficient if you already have a good understanding of the issue, but it relies on having relevant unit tests available or being able to adapt existing ones easily.

**Pros**: Efficient if you have a clear idea of the issue.

**Cons**: Requires existing or easily adaptable unit tests.

##### Option 3: Write a Validation Job

Develop a Beam job to fetch all translations and check their data types, reporting any that are not strings. This approach involves creating and testing the job and then running it on a live server with the help of server admins. It's a thorough and systematic method, but it can be time-consuming and requires server-side execution.

**Pros**: Comprehensive and systematic.

**Cons**: Time-consuming and involves server-side execution.

##### Option 4: Add Logging for Detailed Insight

Insert `logging.error()` statements into the codebase to capture more detailed information when the error happens. By placing these logs around suspected areas of the code, you can gather data that helps you understand the problem better (a brief sketch of this approach appears below). However, this method requires reviewing server logs and might depend on waiting for the error to reoccur in production.

**Pros**: Provides detailed context; useful for understanding complex issues when other methods fail.

**Cons**: Requires additional review of server logs and depends on the error happening again in production.

> [!IMPORTANT]
> Practice 9: Which option do you think is better and why? Consider:
> - Option 1 is ideal if the issue does not require specific production data. It's often the quickest way to validate your hypothesis.
> - Option 2 is preferable if the issue can be simulated with unit tests, especially when you have a clear understanding of the root cause.
> - Option 3 is useful when you need to perform a broad investigation of production data that you don't have locally, and you lack a clear lead on the issue.
> - Option 4 is helpful when other methods do not provide enough detail or you need to generate logs to diagnose the problem more precisely.

In our case, Option 1 or Option 2 is ideal since we have a clear root cause: the HTML content can be either a string or a list of strings. We want to verify that this discrepancy causes the error. Option 3 can serve as a fallback if our direct testing methods do not fully replicate the error.
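To make Option 4 concrete, here is a rough sketch of the kind of temporary logging one might add around the suspected call site. This is illustrative only and not code from the Oppia codebase; the actual call site wraps `get_html_representing_suggestion(suggestion)`, which is simplified here to the `translation_html` field directly:

```python
import logging

# Hypothetical temporary logging, placed just before the failing call in
# _generate_translation_contributor_certificate_data. It records the type
# and value of the suspect field whenever it is not a plain string.
html_content = suggestion.change_cmd.translation_html
if not isinstance(html_content, str):
    logging.error(
        'translation_html has unexpected type %s (value: %r)',
        type(html_content).__name__, html_content)

plain_text = _get_plain_text_from_html_content_string(html_content)
```

If such logs start appearing in the server error reports, they pin down exactly which values carry the unexpected type.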
#### General Rule for Choosing a Verification Option

As a general rule, start with the first item on the list above that makes sense for the issue you're tackling:
- Do Option 1 if the issue doesn't require specific production data.
- Do Option 2 if the issue requires specific production data, but you can simulate it with unit tests.
- Do Option 3 if you need to know what's in the production data in the first place.
- Do Option 4 if you have no idea what to do and need more detailed information.

Let's go with Option 2 for this tutorial. Per the error log, the error is generated by this line:

```python
plain_text = _get_plain_text_from_html_content_string(get_html_representing_suggestion(suggestion))
```

If we look carefully at the function:
```python
def _get_plain_text_from_html_content_string(html_content_string: str) -> str:

    # Rest of the code.
    html_content_string_with_rte_tags_replaced = re.sub(
        r'<oppia-noninteractive-[^>]+>(.*?)</oppia-noninteractive-[^>]+>',
        _replace_rte_tag,
        html_content_string
    )
    # Rest of the code.
```

Within the function, the value `html_content_string` is passed to `re.sub`. The `re.sub` function, as defined in Python's standard library, expects its third argument to be a string (or a bytes-like object).

The error message "expected string or bytes-like object" suggests that `html_content_string` was not of the expected type when `re.sub` attempted to use it. This implies that the input passed to `_get_plain_text_from_html_content_string` was something other than a string or bytes-like object. From the investigation above, we now suspect it to be a list of strings.

> Note: Now that we understand why mypy did not catch this error, we can clarify that the type hint `html_content_string: str` in the function signature is accurate and not part of the issue. The type hint correctly indicates that the function expects a string argument. The problem arose because the attribute was dynamically assigned using `setattr` without a type check, leading to a case where a list of strings was mistakenly passed to this function.

Let's take a look at the unit tests.

> [!IMPORTANT]
> Practice 10: Can you find the unit tests which could be used for our case? Check which unit tests cover the behavior of generating certificates.

In the unit test class `ContributorCertificateTests`, we have the method `test_create_translation_contributor_certificate`, which creates a dummy translation that serves as the input for the certificate generation method. As you can see here:
```python
change_cmd = {
    'cmd': 'add_translation',
    'content_id': 'content',
    'language_code': 'hi',
    'content_html': '',
    'state_name': 'Introduction',
    'translation_html': '<p>Translation for content.</p>'
}
```

`translation_html` holds the suggested translation for an exploration card, and here we are passing an HTML string. Let's try changing it from a string to a list of strings, since that's what we suspect:

```python
change_cmd = {
    'cmd': 'add_translation',
    'content_id': 'content',
    'language_code': 'hi',
    'content_html': '',
    'state_name': 'Introduction',
    'translation_html': ['
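        # Changed from a single HTML string to a list of strings, which is
        # the type we suspect the frontend can send in production.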
<p>Translation for content.</p>
'], + } +``` + +Now let’s run the backend unit tests by running `python -m scripts.run_backend_tests --test_target=core.domain.suggestion_services_test` (You can check out the wiki on how to run backend unit tests: https://github.com/oppia/oppia/wiki/Backend-tests) + +Notice that the tests are failing with a similar error message to what we saw from the production logs. + +```python +| SUMMARY OF TESTS | ++------------------+ + +====================================================================== +ERROR: test_create_translation_contributor_certificate (core.domain.suggestion_services_test.ContributorCertificateTests) +---------------------------------------------------------------------- +Traceback (most recent call last): + File "/Users/ash/Desktop/openSource/oppia/core/domain/suggestion_services_test.py", line 7351, in test_create_translation_contributor_certificate + suggestion_services.generate_contributor_certificate_data( + File "/Users/ash/Desktop/openSource/oppia/core/domain/suggestion_services.py", line 3952, in generate_contributor_certificate_data + data = _generate_translation_contributor_certificate_data( + File "/Users/ash/Desktop/openSource/oppia/core/domain/suggestion_services.py", line 4026, in _generate_translation_contributor_certificate_data + plain_text = _get_plain_text_from_html_content_string( + File "/Users/ash/Desktop/openSource/oppia/core/domain/suggestion_services.py", line 1462, in _get_plain_text_from_html_content_string + html_content_string_with_rte_tags_replaced = re.sub( + File "/Users/ash/.pyenv/versions/3.8.15/lib/python3.8/re.py", line 210, in sub + return _compile(pattern, flags).sub(repl, string, count) +TypeError: expected string or bytes-like object + +--------------------------------------------------------------------- +FAILED core.domain.suggestion_services_test: 1 errors, 0 failures +``` + +Thus, our suspicion appears to be correct. However, it's important to note that multiple bugs could produce the same error message, so while this provides strong supporting evidence, it doesn't conclusively confirm that this is the cause of the initial error. + +> You have successfully reproduced the error locally, validating your hypothesis and preparing to implement a solution. From 10c89f545dd32d233dd93e305027a0e799567d59 Mon Sep 17 00:00:00 2001 From: Ashwath Kannan Date: Fri, 4 Oct 2024 02:27:09 +0530 Subject: [PATCH 6/9] Stage 5 --- ...the-reproduction-steps-for-server-error.md | 53 +++++++++++++++++++ 1 file changed, 53 insertions(+) diff --git a/Tutorial:-Learn-How-to-figure-out-the-reproduction-steps-for-server-error.md b/Tutorial:-Learn-How-to-figure-out-the-reproduction-steps-for-server-error.md index ae062e39..010fca5b 100644 --- a/Tutorial:-Learn-How-to-figure-out-the-reproduction-steps-for-server-error.md +++ b/Tutorial:-Learn-How-to-figure-out-the-reproduction-steps-for-server-error.md @@ -640,3 +640,56 @@ FAILED core.domain.suggestion_services_test: 1 errors, 0 failures Thus, our suspicion appears to be correct. However, it's important to note that multiple bugs could produce the same error message, so while this provides strong supporting evidence, it doesn't conclusively confirm that this is the cause of the initial error. > You have successfully reproduced the error locally, validating your hypothesis and preparing to implement a solution. + +### Stage 5: Document Your Findings + +> Note: When tackling server errors at Oppia, it is essential to document your findings thoroughly on the issue thread. 
This practice not only ensures transparency in the debugging process but also enables other contributors to understand the progress, validate the issue, and collaborate effectively on finding a solution.

#### Start with a Summary of the Error
- Provide a brief description of the server error you encountered.
- Include key details from the error logs and any initial observations.

Example:

I encountered a `TypeError: expected string or bytes-like object` in the `generate_contributor_certificate_data` method while accessing the Contributor Dashboard. The error occurs at line 1408 in the `suggestion_services.py` file, specifically within the `_get_plain_text_from_html_content_string` function, which calls `re.sub`.

#### Detail the Steps Taken to Reproduce the Error
- Outline the steps you followed to reproduce the error locally.
- Mention the environment setup, data used, and any modifications made to the code.

Example:

To reproduce the error:
- I set up a local environment with the latest version of Oppia.
- I modified an existing backend test in `ContributorCertificateTests` to create a dummy translation suggestion. Specifically, I adjusted the `translation_html` field to be a list of strings instead of a single string, which matches the suspected cause of the error.
- I ran the backend tests using `python -m scripts.run_backend_tests --test_target=core.domain.suggestion_services_test`.
- The tests failed with a similar `TypeError` as observed in the production logs, confirming that the issue is reproducible locally.

#### Identify the Commit or PR Likely to Have Introduced the Error:
- Find the commit or PR that might have caused the issue using the Oppia wiki guide.
- Mention the PR in your comment as a starting point for further investigation.

Example:

After examining recent changes, it appears that the issue might have been introduced in a PR that modified the `get_content_html` method to return `Union[str, List[str]]` instead of just a `str`. [Link to the PR](https://github.com/oppia/oppia/pull/17200/files#diff-6bff01224db1fe3047ddf87614d720deffd8efc7e3fca819408547f66926a541L2096) - Here, the return type of the `get_content_html` method changes from a string to either a string or a list of strings. (The link does not scroll automatically; to see the change, check line 2096 of the `exp_domain.py` file.)

#### Explain the Possible Root Causes:
- Describe your analysis of the potential causes.
- Explain why the changes in the identified PR might have led to the error.
- Provide any supporting information, such as error logs or specific observations.

Example:

The `get_content_html` method, which now returns `Union[str, List[str]]`, is used in the `_get_plain_text_from_html_content_string` function. This function expects a `str`, but it sometimes receives a `List[str]`, causing a `TypeError`. The discrepancy between the expected input type (`str`) and the actual input type (`List[str]`) seems to be the root cause of the error.

#### Suggest Next Steps:
- Recommend further testing, confirming the bug with other contributors, or starting work on a fix.
- Clearly outline what should happen next.

Example:

Review the `get_content_html` method and decide whether it should always return a `str`, or update the `_get_plain_text_from_html_content_string` function to handle both `str` and `List[str]`.
Check with other team members to gather insights on whether this bug might affect other parts of the codebase.
If necessary, begin work on a fix by modifying the relevant functions to handle different data types appropriately.

> By providing a clear and detailed comment on the issue thread, you have effectively communicated the problem and your findings to other contributors. This will help others understand the progress, reproduce the issue, and collaborate on finding a solution.

From 5b27ca8bd6dcad7b070b778523d1c7d42c62a522 Mon Sep 17 00:00:00 2001
From: Ashwath Kannan
Date: Fri, 4 Oct 2024 02:31:42 +0530
Subject: [PATCH 7/9] Conclusion

---
 ...the-reproduction-steps-for-server-error.md | 32 +++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/Tutorial:-Learn-How-to-figure-out-the-reproduction-steps-for-server-error.md b/Tutorial:-Learn-How-to-figure-out-the-reproduction-steps-for-server-error.md
index 010fca5b..be9ab865 100644
--- a/Tutorial:-Learn-How-to-figure-out-the-reproduction-steps-for-server-error.md
+++ b/Tutorial:-Learn-How-to-figure-out-the-reproduction-steps-for-server-error.md
@@ -693,3 +693,35 @@ Check with other team members to gather insights on whether this bug might affec
 If necessary, begin work on a fix by modifying the relevant functions to handle different data types appropriately.

 > By providing a clear and detailed comment on the issue thread, you have effectively communicated the problem and your findings to other contributors. This will help others understand the progress, reproduce the issue, and collaborate on finding a solution.

## Conclusion

**Congratulations!** You've learned how to debug server errors that are typically challenging to reproduce due to limited information. You've also developed skills in reading stack traces and determining the steps needed to reproduce a bug.

For more insight, you can explore some debugging stories we've listed on our [wiki](https://github.com/oppia/oppia/wiki/Debugging-Stories). To further deepen your understanding of debugging at Oppia, refer to our [Debugging Guide](https://github.com/oppia/oppia/wiki/Debugging).

If you feel confident and want to apply the skills you've acquired from this tutorial, consider tackling one of the following issues: [Good First Issues for Server Errors](https://github.com/oppia/oppia/issues?q=is%3Aopen+is%3Aissue+label%3A%22server+errors%22+label%3A%22good+first+issue%22).

### Tidy Up

Now that you've completed the tutorial, it's important to tidy up your local repository to avoid any confusion in future development work. Follow these steps:

- Switch Back to the Main Branch: Return to the main branch of the repository:
```bash
git checkout main
```

- Remove Detached State (if needed): If you checked out a specific commit and ended up in a detached HEAD state, switching back to `main` (previous step) already leaves that state behind; you can discard any leftover uncommitted changes by running:
```bash
git reset --hard
```
- Clean Up Untracked Files: If there are any untracked files or changes, you can remove them using:
```bash
git clean -fd
```

By following these steps, you'll have a clean working environment ready for future contributions or tutorials.

### We Value Your Feedback

Did you find this tutorial useful? Or, did you encounter any issues or find things hard to grasp? Let us know by opening a discussion on [GitHub Discussions](https://github.com/oppia/oppia/discussions/new?category=tutorial-feedback). We would be happy to help you and make improvements as needed!
From 99303a6c70138f6faa746ab0d13777d66a943c80 Mon Sep 17 00:00:00 2001 From: Ashwath Kannan <98253080+Ash-2k3@users.noreply.github.com> Date: Fri, 4 Oct 2024 19:40:22 +0530 Subject: [PATCH 8/9] Add TOC and fix heading names --- ...the-reproduction-steps-for-server-error.md | 54 ++++++++++++++----- 1 file changed, 41 insertions(+), 13 deletions(-) diff --git a/Tutorial:-Learn-How-to-figure-out-the-reproduction-steps-for-server-error.md b/Tutorial:-Learn-How-to-figure-out-the-reproduction-steps-for-server-error.md index be9ab865..8bad2a56 100644 --- a/Tutorial:-Learn-How-to-figure-out-the-reproduction-steps-for-server-error.md +++ b/Tutorial:-Learn-How-to-figure-out-the-reproduction-steps-for-server-error.md @@ -1,17 +1,45 @@ -# Tutorial: Learn How to Figure Out the Reproduction Steps for a Server Error +## Table of Contents + +- [Introduction](#introduction) + * [Skills Covered](#skills-covered) + * [Scenario](#scenario) +- [Procedure](#procedure) + + [Setup](#setup) + + [Debugging Process](#debugging-process) + * [Stage 1: Analyze the Error Logs to Locate the Affected Code](#stage-1-analyze-the-error-logs-to-locate-the-affected-code) + + [Understanding the Stack Trace](#understanding-the-stack-trace) + * [Stage 2: Examine the Affected Code](#stage-2-examine-the-affected-code) + * [Stage 3: Investigate Potential Causes by Exploring the Code in Depth](#stage-3-investigate-potential-causes-by-exploring-the-code-in-depth) + + [Investigating the Source of the Problem](#investigating-the-source-of-the-problem) + + [Verify the Return Type by Analyzing the Code](#verify-the-return-type-by-analyzing-the-code) + * [Stage 4: Reproduce the Error](#stage-4-reproduce-the-error) + - [Option 1: Reproduce on a Local Server](#option-1-reproduce-on-a-local-server) + - [Option 2: Use Unit Tests](#option-2-use-unit-tests) + - [Option 3: Write a Validation Job](#option-3-write-a-validation-job) + - [Option 4: Add Logging for Detailed Insight](#option-4-add-logging-for-detailed-insight) + + [General Rule for Choosing an Verifying Option](#general-rule-for-choosing-an-verifying-option) + * [Stage 5: Document Your Findings](#stage-5-document-your-findings) + + [Start with a Summary of the Error](#start-with-a-summary-of-the-error) + + [Detail the Steps Taken to Reproduce the Error](#detail-the-steps-taken-to-reproduce-the-error) + + [Identify the Commit or PR Likely to Have Introduced the Error](#identify-the-commit-or-pr-likely-to-have-introduced-the-error) + + [Explain the Possible Root Causes](#explain-the-possible-root-causes) + + [Suggest Next Steps](#suggest-next-steps) +- [Conclusion](#conclusion) + * [Tidy Up](#tidy-up) + * [We Value Your Feedback](#we-value-your-feedback) ## Introduction This tutorial will guide you through debugging a server error that is challenging to reproduce locally. Specifically, we will investigate and fix a `TypeError` related to certificate generation for contributors. -### Skills Covered: +### Skills Covered - Codebase Navigation - Identifying and Analyzing Error Logs - Debugging Techniques - Reproducing Server Errors Locally -### Scenario: +### Scenario One of the server admins has reported the following error logs. Your task is to investigate the issue and determine how and why it is occurring. @@ -39,13 +67,13 @@ Traceback (most recent call last): TypeError: expected string or bytes-like object ``` -## Procedure: +## Procedure The following steps illustrate how a developer might tackle this issue. Try following this tutorial step-by-step on your local machine! 
This will give you a better sense of how to tackle other similar issues in the codebase. If you get stuck with a step in this tutorial, raise an issue in GitHub Discussions to get help. **Important**: When you see a “practice question box”, stop and try to figure out the answer on your own before reading ahead. You will learn more if you try to figure out your own answer to the question first! -#### Setup: +#### Setup **Install Oppia on Your Local Machine** To begin, you'll need to have Oppia installed on your local machine. If you haven't done so already, please follow the installation steps provided in this [wiki page](https://github.com/oppia/oppia/wiki). @@ -66,7 +94,7 @@ git log -1 The output should display the commit ID `192f0a9a4866debac160015bc949130aaae6a7fe`. -#### Debugging Process: +#### Debugging Process When faced with a server error, developers at Oppia typically follow these steps: @@ -109,7 +137,7 @@ Traceback (most recent call last): TypeError: expected string or bytes-like object ``` -#### Understanding the Stack Trace: +#### Understanding the Stack Trace A **stack trace** is a report of the active stack frames at a certain point in time during the execution of a program. It shows the call sequence leading to the point where the error occurred. Each line provides the file name, line number, and function where the error occurred. @@ -507,7 +535,7 @@ This is why `mypy` does not catch the type mismatch when a function like `_get_p Once you have a hypothesis about the root cause of the issue, it's time to verify it. There are several ways you can do this: -##### Option 1: Reproduce on a Local Server** +##### Option 1: Reproduce on a Local Server Try to replicate the error on a local server by following the user journey that leads to the issue. This involves setting up a local environment, creating or modifying an exploration with rule inputs, and attempting to generate a certificate to see if the error occurs again. This method is practical and quick if you can accurately simulate user actions, but it may not always capture the exact conditions of the live environment. @@ -523,7 +551,7 @@ Modify or create unit tests to check if they trigger the same error when using t **Cons**: Requires existing or easily adaptable unit tests. -##### Option 3: Write a Validation Job** +##### Option 3: Write a Validation Job Develop a Beam job to fetch all translations and check their data types, reporting any that are not strings. This approach involves creating and testing the job and then running it on a live server with the help of server admins. It's a thorough and systematic method, but it can be time-consuming and requires server-side execution. @@ -531,7 +559,7 @@ Develop a Beam job to fetch all translations and check their data types, reporti **Cons**: Time-consuming and involves server-side execution. -##### Option 4: Write a Validation Job** +##### Option 4: Add Logging for Detailed Insight Insert `logging.error()` statements into the codebase to capture more detailed information when the error happens. By placing these logs around suspected areas of the code, you can gather data that helps you understand the problem better. However, this method requires reviewing server logs and might depend on waiting for the error to reoccur in production. @@ -665,7 +693,7 @@ To reproduce the error: - I ran the backend tests using `python -m scripts.run_backend_tests --test_target=core.domain.suggestion_services_test`. 
- The tests failed with a similar `TypeError` as observed in the production logs, confirming that the issue is reproducible locally. -#### Identify the Commit or PR Likely to Have Introduced the Error: +#### Identify the Commit or PR Likely to Have Introduced the Error - Find the commit or PR that might have caused the issue using the Oppia wiki guide. - Mention the PR in your comment as a starting point for further investigation. @@ -673,7 +701,7 @@ Example: After examining recent changes, it appears that the issue might have been introduced in a PR that modified the `get_content_html` method to return `Union[str, List[str]]` instead of just a `str`. [Link to the PR](https://github.com/oppia/oppia/pull/17200/files#diff-6bff01224db1fe3047ddf87614d720deffd8efc7e3fca819408547f66926a541L2096) - Here, we are changing the return type of the get_content_html method from string to either strings/list. (It's not auto navigating, if you want to look, please check exp_domain file's line number 2096). -#### Explain the Possible Root Causes: +#### Explain the Possible Root Causes - Describe your analysis of the potential causes. - Explain why the changes in the identified PR might have led to the error. - Provide any supporting information, such as error logs or specific observations. @@ -682,7 +710,7 @@ Example: The `get_content_html` method, which now returns `Union[str, List[str]]`, is being used in the `_get_plain_text_from_html_content_string function`. This function expects a `str`, but it sometimes receives a `List[str]`, causing a `TypeError`. The discrepancy between the expected input type (str) and the actual input type (List[str]) seems to be the root cause of the error. -#### Suggest Next Steps: +#### Suggest Next Steps - Recommend further testing, confirming the bug with other contributors, or starting work on a fix. - Clearly outline what should happen next. From 6a01e0bcc653531a431af6a169f2cec5d7fa115d Mon Sep 17 00:00:00 2001 From: Ashwath Kannan Date: Fri, 4 Oct 2024 19:51:18 +0530 Subject: [PATCH 9/9] Add link in sidebar + Rename the tutorial name --- ...earn-to-figure-out-the-reproduction-steps-for-server-error.md | 0 _Sidebar.md | 1 + 2 files changed, 1 insertion(+) rename Tutorial:-Learn-How-to-figure-out-the-reproduction-steps-for-server-error.md => Tutorial:-Learn-to-figure-out-the-reproduction-steps-for-server-error.md (100%) diff --git a/Tutorial:-Learn-How-to-figure-out-the-reproduction-steps-for-server-error.md b/Tutorial:-Learn-to-figure-out-the-reproduction-steps-for-server-error.md similarity index 100% rename from Tutorial:-Learn-How-to-figure-out-the-reproduction-steps-for-server-error.md rename to Tutorial:-Learn-to-figure-out-the-reproduction-steps-for-server-error.md diff --git a/_Sidebar.md b/_Sidebar.md index f6edf28d..2e338f01 100644 --- a/_Sidebar.md +++ b/_Sidebar.md @@ -52,6 +52,7 @@ * [[Debug frontend code|Debug-frontend-code]] * [[Debugging Stories|Debugging-Stories]] * [[Testing email functionality|Testing-email-functionality]] + * [[Tutorial - Learn-to-figure-out-the-reproduction-steps-for-server-error|Tutorial:-Learn-to-figure-out-the-reproduction-steps-for-server-error]] * Testing * [[Backend tests|Backend-tests]] * [[Frontend tests|Frontend-tests]]