This page walks you through going from a statistic you want to calculate from Oppia data to creating the models and jobs and getting them up and running. For each step of the process, we've included a reference to an example pull request that gives a good idea of what that step involves.
This document walks through how to create each of these levels.
The recommended way of going through this process is to:
- Plan the overall approach: start at the presentation layer and work your way down the layers to the event log (steps 1 - 4 in the diagram below). This will not involve writing code.
- Write code to record the data you need for your calculations (steps 5-7).
- Use the data in the UI (step 8).
Each of these three sections will be a separate commit in a branch off of develop. After all the steps are completed and reviewed, this branch can be merged into develop.
- Start by thinking about what you want to display. This could mean drawing charts and/or listing out data you want to display.
- List out processed data fields. Figure out which data you need to display and find one field for each. For example, a bulleted list would have one field in the model for each bullet point. A histogram could be drawn by having a list of data point values, but if you want to have standard deviation or mean, you would want to have separate fields for those as well.
- Map out how you would calculate each field in your model. For some fields this may be straightforward, such as counting how many students reach a given page out of the number of students who started the exploration. Others may be more complex, such as measuring how confused students are, and this stage may take a while.
- Figure out what interactions you would need to keep track of to do those calculations. For mapping out completions of an exploration, this might be “I need to know when students complete the exploration.” Keep in mind when you would record these events: if you wanted to know the average time spent in a particular state, the requirement would become “I need to know how much time was spent in the state when the student leaves the exploration”.
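As an illustration of the "one field per displayed value" idea from the planning steps above, here is a minimal sketch in plain Python. The class `TimeSpentSummary`, its field names, and the `summarize` helper are hypothetical and are not part of the Oppia codebase; the sketch only shows why a histogram that also displays a mean and standard deviation ends up with three fields rather than one.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import List


@dataclass
class TimeSpentSummary:
    """Hypothetical processed-data record backing a time-spent histogram."""

    values_secs: List[float]  # Raw data points; enough to draw the histogram.
    mean_secs: float          # Separate field because the UI displays it.
    std_dev_secs: float       # Separate field because the UI displays it.


def summarize(values_secs: List[float]) -> TimeSpentSummary:
    """Builds the processed fields from the recorded data points."""
    if not values_secs:
        raise ValueError('At least one data point is required.')
    return TimeSpentSummary(
        values_secs=list(values_secs),
        mean_secs=mean(values_secs),
        # Population standard deviation over all recorded data points.
        std_dev_secs=pstdev(values_secs),
    )
```

Listing the fields this way also makes the next planning step easier, since each field points at the raw events it has to be calculated from.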
So, you have a model of what you want to create and you know which model it should be in.
- (Example: #3841) Changes to core.storage.statistics.gae_models.py:
  - Define a class of the form <statistic_name>EventLogEntryModel. Each instance of this model accounts for one count of the statistic you want to record. Make sure the model contains all the information you need to identify the context of an instance (exploration_id, schema_version, etc.). Instances of this event model are used to recompute the aggregated statistics model if there are any data discrepancies.
  - Now add the statistic count as a field of the ‘ExplorationStatsModel’ in the same file. The field should be named <statistic_name>_v2. Remember to update the corresponding getter methods, as well as the save and retrieve methods for this model.
- (Example: #3857) Changes to ‘core.domain.stats_domain.py’:
  - Add the statistic field to the ExplorationStats class and modify the corresponding helper methods accordingly.
  - If it is a state-level statistic, modify the StateStats class instead of the ExplorationStats class.
- (Example: #3916) Now, open the file ‘core.controllers.reader.py’.
  - First, add validation for the new field in the class ‘StatsEventHandler’. This event handler updates the aggregated statistics model.
  - Then, create a new event handler for the new statistic, of the form ‘<statistics_name>EventHandler’. This handler records one instance for each count of the statistic.
- (Example: #3916) Now, open the file ‘main.py’. Create a route through which the event models will be recorded.
- (Example: #3916) Open ‘core.templates.dev.head.pages.exploration-player-page.services.stats-reporting-service.ts’. Add a method to this file for recording the event. The function name can be of the form ‘record’. This function does two things:
  - Makes a $http.post() call to the URL we created in step 6 (part 2), with the respective args, to record an instance of the event log entry model.
  - Increments the count for this statistic field in the aggregatedStats dictionary, which holds the values for the ExplorationStatsModel. Periodic calls are sent automatically to the StatsEventHandler to update the ExplorationStatsModel.
- (Example: #3916) Now, we need to figure out where in the player view the recording should happen. For state-related stats, we would probably capture the new statistic on entering or leaving a state. For answer-related stats, you would probably capture it inside the submitAnswer function in the player view.
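The recording steps above can be sketched end to end in plain Python. This is a simplified, in-memory stand-in and not Oppia's actual implementation: the real models are datastore models and the real handlers are web request handlers. Every name here (`HintRequestedEventLogEntry`, `num_hint_requests_v2`, `InMemoryStatsStore`, and the `exploration_version` context field) is a hypothetical example chosen for illustration. The sketch shows the two write paths (one log entry per event, plus batched increments to the aggregated stats) and how the event log can be replayed to repair the aggregate.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class HintRequestedEventLogEntry:
    """Stand-in for an event log entry model: one instance per count."""

    exploration_id: str
    exploration_version: int
    state_name: str
    session_id: str


@dataclass
class ExplorationStats:
    """Stand-in for the aggregated stats model."""

    exploration_id: str
    exploration_version: int
    num_hint_requests_v2: int = 0  # Hypothetical <statistic_name>_v2 field.


class InMemoryStatsStore:
    """Hypothetical store playing the role of both event handlers."""

    def __init__(self) -> None:
        self.event_log: List[HintRequestedEventLogEntry] = []
        self.stats: Dict[Tuple[str, int], ExplorationStats] = {}

    def record_event(self, entry: HintRequestedEventLogEntry) -> None:
        """Per-event path: stores one log entry for each occurrence."""
        self.event_log.append(entry)

    def apply_aggregated_stats(
        self, exp_id: str, exp_version: int, aggregated_stats: Dict[str, int]
    ) -> None:
        """Aggregated path: validates a batched count and adds it."""
        count = aggregated_stats.get('num_hint_requests_v2')
        if isinstance(count, bool) or not isinstance(count, int) or count < 0:
            raise ValueError('num_hint_requests_v2 must be an int >= 0.')
        key = (exp_id, exp_version)
        stats = self.stats.setdefault(key, ExplorationStats(exp_id, exp_version))
        stats.num_hint_requests_v2 += count

    def recompute_from_event_log(
        self, exp_id: str, exp_version: int
    ) -> ExplorationStats:
        """Rebuilds the aggregate by counting the matching log entries."""
        count = sum(
            1 for entry in self.event_log
            if entry.exploration_id == exp_id
            and entry.exploration_version == exp_version
        )
        stats = ExplorationStats(exp_id, exp_version, count)
        self.stats[(exp_id, exp_version)] = stats
        return stats
```

If a batched update is lost, the aggregated count drifts below the number of log entries; calling `recompute_from_event_log` restores it, which is why the event log entries need to carry enough context to be matched back to an exploration.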
Now, your newly added statistic will be available in the ExplorationStatsModel. You should find that the ExplorationStatsModel has already been retrieved in the statistics tab. Visualizing your newly added statistic is then simply a matter of using the corresponding field from the stats model in the view.
Well, that’s the gist of it. Have fun recording your stats, and don’t forget to write tests along the way.