Add simple tutorial to demonstrate -- read codelist, export to excel,… #221

Open: wants to merge 2 commits into base: main
210 changes: 210 additions & 0 deletions doc/source/user_guide/tutorial_edit_existing_variable_codelist.ipynb
@@ -0,0 +1,210 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "15ad0472",
"metadata": {},
"source": [
"# Tutorial: read a variable codelist, edit it in Excel and write it back to yaml"
]
},
{
"cell_type": "markdown",
"id": "0fc9ddc0",
"metadata": {},
"source": [
"In this example, we update a `.yaml` variable codelist file, but make the edits in Excel.\n",
"\n",
"Basic steps:\n",
"1. Export/load the existing `variable` definition codelist from `.yaml` file in the project directory\n",
"2. Write out this codelist to Excel\n",
"3. Apply edits manually in Excel\n",
"4. Read in the Excel and write out to `.yaml` again.\n",
"\n",
"N.B. You will need the latest version of the workflow repository, e.g. github.com/iiasa/xxxx-workflow. \n",
"Navigate to the `definitions` folder, which typically has folders named `variable`, `region` and `scenario`. Launch the Jupyter notebook from the definitions folder. \n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e6a0e25f",
"metadata": {},
"outputs": [],
"source": [
"import nomenclature\n"
]
},
{
"cell_type": "markdown",
"id": "748ac323",
"metadata": {},
"source": [
"## 1. Export/load the existing `variable` definition codelist \n",
"\n",
"Load the definitions from the current directory (or give the path as argument), \n",
"e.g. 'C:\\\\Github\\\\engage-internal-workflow\\\\definitions'\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "47b62fcf",
"metadata": {},
"outputs": [],
"source": [
"DSD = nomenclature.DataStructureDefinition('.')"
]
},
{
"cell_type": "markdown",
"id": "fbb0a185",
"metadata": {},
"source": [
"## 2. Write out this codelist to Excel"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6fc1988f",
"metadata": {},
"outputs": [],
"source": [
"# Save the variable CodeList to Excel (only do this once)\n",
"temp_excel_out = 'temp_variables_excel.xlsx'\n",
"DSD.variable.to_excel(temp_excel_out, sheet_name='variable')"
]
},
{
"cell_type": "markdown",
"id": "2f66438a",
"metadata": {},
"source": [
"## 3. Apply edits manually in Excel\n",
"Make your edits in Excel. \n",
"\n",
"Add/remove variables, improve definitions, specify weights and region-aggregations, etc."
]
},
{
"cell_type": "markdown",
"id": "c28dc29f",
"metadata": {},
"source": [
"## ...."
]
},
{
"cell_type": "markdown",
"id": "e715c46c",
"metadata": {},
"source": [
"## 4. Read in the Excel and write out to `.yaml` again.\n",
"In `attrs`, list the names of the additional columns (attributes) that are present in the Excel file. You do not need to include the `Variable` column, as that is passed as the `col` argument of the `create_yaml_from_xlsx` function."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "639b35c4",
"metadata": {},
"outputs": [],
"source": [
"# Excel file written in step 2 and the attribute columns to read back from it\n",
"temp_excel_out = 'temp_variables_excel.xlsx'\n",
"attrs = ['Unit', 'Skip_region_aggregation', 'Check_aggregate',\n",
"         'Description', 'Required', 'Note', 'Region_aggregation', 'Weight']"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e9431c3b",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"id": "7601d725",
"metadata": {},
"outputs": [],
"source": [
"# Choose the output name carefully if you want to avoid overwriting the previous file\n",
"yaml_file_out = 'variable/variables_new.yaml'\n",
"nomenclature.create_yaml_from_xlsx(temp_excel_out, yaml_file_out, 'variable', 'Variable', attrs)\n"
]
},
{
"cell_type": "markdown",
"id": "fb95a8a5",
"metadata": {},
"source": [
"## Notes\n",
"- The new `.yaml` codelist is now written out. You can also choose to overwrite the original file directly. \n",
"- Initialising the `DataStructureDefinition` (step 1) automatically parses all available `.yaml` files, so if both the old and the new `.yaml` files are present when you repeat the process, you will likely get a duplication error. \n",
"- New `.yaml` files may come with extra attribute columns, and/or default values (e.g. `skip-aggregation=False`) as new functions and defaults are added to `nomenclature`.\n",
"\n",
" "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "89835034",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"id": "3a8438d9",
"metadata": {},
"outputs": [],
"source": [
"# Check that the new file loads and the validation checks pass (make sure the old file is no longer present)\n",
"DSD1 = nomenclature.DataStructureDefinition('.')"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f0160797",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"id": "913675ae",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.15"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
6 changes: 6 additions & 0 deletions doc/source/user_guide/variable.rst
@@ -155,3 +155,9 @@ sum up to the value of the category. The feature uses the **pyam** method
* The method :meth:`DataStructureDefinition.check_aggregate` returns a
:class:`pandas.DataFrame` with a comparison of the original value and the computed
aggregate for all variables that fail the validation.

Editing a CodeList
------------------
A codelist can be edited directly in its `yaml` file, although this may not always be convenient.

Alternatively, generate an `Excel` version of the codelist, make the necessary edits in Excel, and then convert it back into a correctly formatted `yaml` file. To do this, see the tutorial :ref:`tutorial_edit_existing_variable_codelist`.
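
The tutorial notes that loading a ``DataStructureDefinition`` parses all available `yaml` files, so keeping both the old and the regenerated codelist in the ``variable`` folder will likely raise a duplication error. A minimal sketch of one way to handle this is shown below, using only the Python standard library. The helper ``archive_old_codelists`` and the folder and file names are hypothetical and not part of **nomenclature**; the sketch also assumes that all codelist files sit directly in the ``variable`` folder rather than in nested subfolders.

```python
import shutil
from pathlib import Path


def archive_old_codelists(variable_dir, keep, archive_dir):
    """Move every .yaml file in `variable_dir` except `keep` to `archive_dir`.

    Hypothetical helper for illustration only; not part of nomenclature.
    Returns the sorted names of the files that were moved.
    """
    variable_dir = Path(variable_dir)
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)

    moved = []
    # Only top-level .yaml files are considered (non-recursive glob)
    for path in sorted(variable_dir.glob("*.yaml")):
        if path.name == keep:
            continue  # the freshly written codelist stays in place
        shutil.move(str(path), str(archive_dir / path.name))
        moved.append(path.name)
    return moved
```

For example, ``archive_old_codelists("variable", "variables_new.yaml", "../archived_definitions")`` would leave only the new file in place before the definitions are loaded again. Choosing an archive folder outside the ``definitions`` folder is a cautious default, so that the archived files cannot be picked up when the definitions are parsed.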