diff --git a/doc/sphinx/source/recipes/figures/weathertyping/ERA5__lwt_rel_occurrence_1958-2014.png b/doc/sphinx/source/recipes/figures/weathertyping/ERA5__lwt_rel_occurrence_1958-2014.png new file mode 100644 index 0000000000..d42fbf8784 Binary files /dev/null and b/doc/sphinx/source/recipes/figures/weathertyping/ERA5__lwt_rel_occurrence_1958-2014.png differ diff --git a/doc/sphinx/source/recipes/figures/weathertyping/correlation_matrix_E-OBS_1958-2014.png b/doc/sphinx/source/recipes/figures/weathertyping/correlation_matrix_E-OBS_1958-2014.png new file mode 100644 index 0000000000..6288613b4a Binary files /dev/null and b/doc/sphinx/source/recipes/figures/weathertyping/correlation_matrix_E-OBS_1958-2014.png differ diff --git a/doc/sphinx/source/recipes/figures/weathertyping/lwt_1_ERA5__psl_mean_1958-2014.png b/doc/sphinx/source/recipes/figures/weathertyping/lwt_1_ERA5__psl_mean_1958-2014.png new file mode 100644 index 0000000000..b2cba03df6 Binary files /dev/null and b/doc/sphinx/source/recipes/figures/weathertyping/lwt_1_ERA5__psl_mean_1958-2014.png differ diff --git a/doc/sphinx/source/recipes/figures/weathertyping/lwt_1_TaiESM1_r1i1p1f1_psl_mean_1950-2014.png b/doc/sphinx/source/recipes/figures/weathertyping/lwt_1_TaiESM1_r1i1p1f1_psl_mean_1950-2014.png new file mode 100644 index 0000000000..c8f8ad5fda Binary files /dev/null and b/doc/sphinx/source/recipes/figures/weathertyping/lwt_1_TaiESM1_r1i1p1f1_psl_mean_1950-2014.png differ diff --git a/doc/sphinx/source/recipes/figures/weathertyping/slwt_EOBS_4_ERA5__psl_mean_1958-2014.png b/doc/sphinx/source/recipes/figures/weathertyping/slwt_EOBS_4_ERA5__psl_mean_1958-2014.png new file mode 100644 index 0000000000..1f15102db7 Binary files /dev/null and b/doc/sphinx/source/recipes/figures/weathertyping/slwt_EOBS_4_ERA5__psl_mean_1958-2014.png differ diff --git a/doc/sphinx/source/recipes/index.rst b/doc/sphinx/source/recipes/index.rst index e18ada0fd7..73ff540609 100644 --- a/doc/sphinx/source/recipes/index.rst +++ 
b/doc/sphinx/source/recipes/index.rst @@ -64,6 +64,7 @@ Atmosphere recipe_validation recipe_radiation_budget recipe_aod_aeronet_assess + recipe_weathertyping Climate metrics ^^^^^^^^^^^^^^^ diff --git a/doc/sphinx/source/recipes/recipe_weathertyping.rst b/doc/sphinx/source/recipes/recipe_weathertyping.rst new file mode 100644 index 0000000000..e49889f540 --- /dev/null +++ b/doc/sphinx/source/recipes/recipe_weathertyping.rst @@ -0,0 +1,113 @@ +.. _recipes_weathertyping: + +Lamb Weathertypes +================= + +Overview +-------- + +A diagnostic to calculate Lamb weathertypes over a given region. Furthermore, +correlations between weathertypes and precipitation patterns over a given area can be calculated, +and 'combined' or 'simplified' weathertypes can be derived. Additionally, mean fields, as well as +anomalies and standard deviations, can be plotted. + + +Available recipes and diagnostics +--------------------------------- + +Recipes are stored in esmvaltool/recipes/ + + * recipe_weathertyping.yml + +Diagnostics are stored in esmvaltool/diag_scripts/weathertyping/ + + * weathertyping.py: calculate Lamb and simplified weathertypes; plot means, anomalies and standard deviations for each weathertype for psl, tas and pr + + +User settings in recipe +----------------------- + +#. weathertyping.py + + *Required settings for script* + + *Optional settings for script* + + * correlation_threshold: correlation threshold for selecting similar WT pairs; only needed if automatic_slwt==True and predefined_slwt==False. default=0.9 + * rmse_threshold: RMSE threshold for selecting similar WT pairs; only needed if automatic_slwt==True and predefined_slwt==False. default=0.002 + * plotting: if true, create plots of means, anomalies and standard deviations for psl, tas and pr + * automatic_slwt: if true, automatically combine WTs with similar precipitation patterns over the specified area (via the correlation and RMSE thresholds OR via predefined_slwt) + * predefined_slwt: dictionary of mappings between weathertypes + +..
note:: + + predefined_slwt can be a dictionary where keys are slwt and values are arrays of lwt, OR where keys are lwt and values are slwt + + *Required settings for variables* + + *Optional settings for variables* + + *Required settings for preprocessor* + + *Optional settings for preprocessor* + + *Color tables* + + +Variables +--------- + +* psl (atmos, day, time longitude latitude) +* tas (atmos, day, time longitude latitude) +* tp (atmos, day, time longitude latitude) +* pr (atmos, day, time longitude latitude) + + +Observations and reformat scripts +--------------------------------- + +*Note: (1) obs4MIPs data can be used directly without any preprocessing; +(2) see headers of reformat scripts for non-obs4MIPs data for download +instructions.* + +* E-OBS: European Climate Assessment & Dataset gridded daily precipitation sum +* ERA5: ECMWF reanalysis + +References +---------- + +* Maraun, D., Truhetz, H., & Schaffer, A. (2021). Regional climate model biases, their dependence on synoptic circulation biases and the potential for bias adjustment: A process-oriented evaluation of the Austrian regional climate projections. Journal of Geophysical Research: Atmospheres, 126, e2020JD032824. https://doi.org/10.1029/2020JD032824 +* Jones, P.D., Hulme, M. and Briffa, K.R. (1993), A comparison of Lamb circulation types with an objective classification scheme. Int. J. Climatol., 13: 655-663. https://doi.org/10.1002/joc.3370130606 + +Example plots +------------- + +.. _fig_weathertyping_1: +.. figure:: /recipes/figures/weathertyping/lwt_1_ERA5__psl_mean_1958-2014.png + :align: center + + PSL mean map of Lamb WT 1 for ERA5. + +.. _fig_weathertyping_2: +.. figure:: /recipes/figures/weathertyping/lwt_1_TaiESM1_r1i1p1f1_psl_mean_1950-2014.png + :align: center + + PSL mean map of Lamb WT 1 for TaiESM1. + +.. _fig_weathertyping_3: +..
figure:: /recipes/figures/weathertyping/slwt_EOBS_4_ERA5__psl_mean_1958-2014.png + :align: center + + PSL mean map of slwt_EOBS 4 for ERA5 (in this case combined Lamb WTs 24 and 23). + +.. _fig_weathertyping_4: +.. figure:: /recipes/figures/weathertyping/correlation_matrix_E-OBS_1958-2014.png + :align: center + + Heatmap of correlation values for Lamb WTs 1-27. + +.. _fig_weathertyping_5: +.. figure:: /recipes/figures/weathertyping/ERA5__lwt_rel_occurrence_1958-2014.png + :align: center + + Stackplot of seasonal relative occurrences of each WT for ERA5. diff --git a/esmvaltool/config-references.yml b/esmvaltool/config-references.yml index 199dc671e0..2cb8e402d1 100644 --- a/esmvaltool/config-references.yml +++ b/esmvaltool/config-references.yml @@ -286,6 +286,11 @@ authors: name: Juckes, Martin institute: BADC, UK orcid: + jury_martin: + name: Jury, Martin + institute: WEGC, Austria + email: martin.jury@uni-graz.at + orcid: https://orcid.org/0000-0003-0590-7843 kadygrov_nikolay: name: Kadygrov, Nikolay institute: IPSL, France @@ -315,6 +320,12 @@ name: Krasting, John institute: NOAA, USA orcid: https://orcid.org/0000-0002-4650-9844 + kroissenbrunner_thomas: + name: Kroissenbrunner, Thomas + institute: WEGC, Austria + email: thomas.kroissenbrunner@uni-graz.at + github: thomaskroi1996 + orcid: kuehbacher_birgit: name: Kuehbacher, Birgit institute: DLR, Germany @@ -793,6 +804,7 @@ projects: trr181: DFG Project TRR-181, Energy transfers in Atmosphere and Ocean ukesm: UKESM, UK Earth System Model project (NERC) usmile: ERC Synergy Grant USMILE + preval: PREVAL ÖKS realms: diff --git a/esmvaltool/diag_scripts/climate_metrics/feedback_parameters.py b/esmvaltool/diag_scripts/climate_metrics/feedback_parameters.py index db350982a2..d6bd28b0fb 100644 --- a/esmvaltool/diag_scripts/climate_metrics/feedback_parameters.py +++ b/esmvaltool/diag_scripts/climate_metrics/feedback_parameters.py @@ -365,7 +365,7 @@ def _create_regression_plot(tas_cube, y_reg = reg.slope *
x_reg + reg.intercept # Plot data - title = (f'{FEEDBACK_PARAMETERS.get(var,var)} TOA radiance for ' + title = (f'{FEEDBACK_PARAMETERS.get(var, var)} TOA radiance for ' f'{dataset_name}') filename = f'{var}_regression_{dataset_name}' if description is not None: diff --git a/esmvaltool/diag_scripts/weathertyping/weathertyping.py b/esmvaltool/diag_scripts/weathertyping/weathertyping.py new file mode 100644 index 0000000000..8997537bb8 --- /dev/null +++ b/esmvaltool/diag_scripts/weathertyping/weathertyping.py @@ -0,0 +1,260 @@ +"""Calculate weathertypes for the input datasets and write them to file. + +Also plot the means and seasonal occurrence of the weathertypes, and +offer the option to calculate simplified weathertypes based on +precipitation patterns. +""" +import iris +from wt_utils import get_cfg_vars, load_wt_preprocessors, wt_algorithm, \ get_ancestors_era5_eobs, calc_slwt_obs, run_predefined_slwt, \ combine_wt_to_file, load_wt_files, get_looping_dict, plot_means, \ plot_seasonal_occurrence, write_lwt_to_file, get_model_output_filepath, \ calc_lwt_slwt_model, calc_lwt_model + +# import internal esmvaltool modules here +from esmvaltool.diag_scripts.shared import run_diagnostic + + +def run_automatic_slwt(cfg: dict): + """Run the automated calculation of simplified weathertypes. + + Write them to file and plot the means and seasonal occurrence + of the weathertypes.
+ + Args: + cfg (dict): Nested dictionary of metadata + """ + preproc_variables_dict, _, _, \ + work_dir, plotting, _, predefined_slwt = get_cfg_vars(cfg) + for dataset_name, dataset_vars in preproc_variables_dict.items(): + timerange = dataset_vars[0].get('timerange').replace('/', '-') + if dataset_name == 'ERA5': + wt_preproc, wt_preproc_prcp, wt_preproc_prcp_eobs = \ + load_wt_preprocessors(dataset_name, preproc_variables_dict) + + # calculate lwt + lwt = wt_algorithm(wt_preproc, dataset_name) + + era5_ancestors, eobs_ancestors = get_ancestors_era5_eobs( + dataset_name, preproc_variables_dict) + + # calculate simplified lwt based on precipitation + # patterns or use predefined_slwt + if predefined_slwt is False: + slwt_era5 = calc_slwt_obs( + cfg, + lwt, + wt_preproc_prcp, + dataset_name, + era5_ancestors, + timerange + ) + slwt_eobs = calc_slwt_obs( + cfg, + lwt, + wt_preproc_prcp_eobs, + 'E-OBS', + eobs_ancestors, + timerange + ) + else: + slwt_era5, slwt_eobs = run_predefined_slwt(work_dir, + dataset_name, + lwt, + predefined_slwt) + + # write to file + wt_list = [lwt, slwt_era5, slwt_eobs] + combine_wt_to_file(cfg, wt_list, wt_preproc, + dataset_name) + + # load weathertype files as cubes + wt_cubes = load_wt_files(f'{work_dir}/{dataset_name}.nc') + + var_dict = get_looping_dict( + dataset_vars + ) # dataset_vars is list of variables for dataset dataset_name + if plotting: + # plot means + for var_name, var_data in var_dict.items(): + data_info = {'dataset': dataset_name, + 'var': var_name, + 'preproc_path': var_data[1], + 'ensemble': + dataset_vars[0].get('ensemble', ''), + 'timerange': timerange} + plot_means(cfg, var_data[0], wt_cubes, data_info) + plot_seasonal_occurrence(cfg, wt_cubes, dataset_name, + timerange) + else: + if dataset_name == 'E-OBS': + continue + wt_preproc = iris.load_cube(dataset_vars[0].get('filename')) + + output_file_path, preproc_path = get_model_output_filepath( + dataset_name, dataset_vars) + # calculate weathertypes + data_info 
= {'dataset': dataset_name, + 'preproc_path': preproc_path, + 'output_file_path': output_file_path, + 'ensemble': dataset_vars[0].get('ensemble', ''), + 'timerange': timerange} + calc_lwt_slwt_model(cfg, wt_preproc, data_info, predefined_slwt) + + # load wt files + wt_cubes = load_wt_files( + f'{work_dir}/{output_file_path}/{dataset_name}.nc') + + var_dict = get_looping_dict( + dataset_vars + ) # dataset_vars is list of variables for dataset dataset_name + + # plot means + if plotting: + for var_name, var_data in var_dict.items(): + data_info = {'dataset': dataset_name, + 'var': var_name, + 'preproc_path': var_data[1], + 'ensemble': dataset_vars[0].get( + 'ensemble', ''), + 'timerange': timerange} + plot_means(cfg, var_data[0], wt_cubes, data_info, + only_lwt=True) + plot_seasonal_occurrence(cfg, wt_cubes, dataset_name, + timerange, + ensemble=dataset_vars[0]. + get('ensemble', '')) + + +def run_lwt(cfg: dict): + """Run the calculation of Lamb weathertypes. + + Write them to file and plot the means and seasonal occurrence + of the weathertypes.
+ + Args: + cfg (dict): Nested dictionary of metadata + """ + preproc_variables_dict, _, _, \ + work_dir, plotting, _, _ = get_cfg_vars(cfg) + for dataset_name, dataset_vars in preproc_variables_dict.items(): + timerange = dataset_vars[0].get('timerange').replace('/', '-') + if dataset_name == 'ERA5': + wt_preproc, wt_preproc_prcp, wt_preproc_prcp_eobs = \ + load_wt_preprocessors(dataset_name, preproc_variables_dict) + + # calculate lwt + lwt = wt_algorithm(wt_preproc, dataset_name) + + era5_ancestors, eobs_ancestors = get_ancestors_era5_eobs( + dataset_name, preproc_variables_dict) + + # calculate simplified lwt based on precipitation patterns + _ = calc_slwt_obs( + cfg, + lwt, + wt_preproc_prcp, + dataset_name, + era5_ancestors, + timerange + ) + _ = calc_slwt_obs( + cfg, + lwt, + wt_preproc_prcp_eobs, + 'E-OBS', + eobs_ancestors, + timerange + ) + + # write only lwt to file + write_lwt_to_file(cfg, lwt, wt_preproc, dataset_name) + + # load wt files + wt_cubes = load_wt_files(f'{work_dir}/{dataset_name}.nc', + only_lwt=True) + + var_dict = get_looping_dict( + dataset_vars + ) # dataset_vars is list of variables for dataset dataset_name + + if plotting: + # plot means + timerange = dataset_vars[0].get('timerange').replace('/', '-') + for var_name, var_data in var_dict.items(): + data_info = {'dataset': dataset_name, + 'var': var_name, + 'preproc_path': var_data[1], + 'ensemble': dataset_vars[0].get( + 'ensemble', ''), + 'timerange': timerange} + plot_means(cfg, var_data[0], wt_cubes, data_info, + only_lwt=True) + plot_seasonal_occurrence(cfg, wt_cubes, dataset_name, + timerange) + else: + if dataset_name == 'E-OBS': + continue + wt_preproc = iris.load_cube(dataset_vars[0].get('filename')) + + output_file_path = get_model_output_filepath( + dataset_name, dataset_vars)[1] + + # calculate weathertypes + data_info = {'output_file_path': output_file_path, + 'ensemble': dataset_vars[0].get('ensemble', ''), + 'timerange': timerange} + calc_lwt_model(cfg, wt_preproc, 
dataset_name, data_info) + + # load wt files + wt_cubes = load_wt_files(f'{work_dir}/{dataset_name}.nc', + only_lwt=True) + + var_dict = get_looping_dict( + dataset_vars + ) # dataset_vars is list of variables for dataset dataset_name + + if plotting: + # plot means + ensemble = dataset_vars[0].get('ensemble') + for var_name, var_data in var_dict.items(): + data_info = {'dataset': dataset_name, + 'var': var_name, + 'preproc_path': var_data[1], + 'ensemble': dataset_vars[0].get( + 'ensemble', ''), + 'timerange': timerange} + plot_means(cfg, var_data[0], wt_cubes, data_info, + only_lwt=True) + plot_seasonal_occurrence(cfg, wt_cubes, dataset_name, + timerange, + ensemble=ensemble) + + +def run_my_diagnostic(cfg: dict): + """Run the weathertyping diagnostic. + + Args: + cfg (dict): Nested dictionary of metadata + """ + # the input data is grouped by dataset name (via group_metadata in + # get_cfg_vars); the resulting dictionary is keyed on datasets, + # e.g. {'MPI-ESM-LR': [var1, var2, ...]}, where var1, var2 are dicts + # holding all needed information per variable + automatic_slwt = cfg.get('automatic_slwt') + + if automatic_slwt: + run_automatic_slwt(cfg) + # if automatic_slwt is false, only the Lamb weathertypes are + # calculated and written to file + else: + run_lwt(cfg) + + +if __name__ == '__main__': + + # always use run_diagnostic() to get the config (the preprocessor + # nested dictionary holding all the needed information) + with run_diagnostic() as config: + run_my_diagnostic(config) diff --git a/esmvaltool/diag_scripts/weathertyping/wt_utils.py b/esmvaltool/diag_scripts/weathertyping/wt_utils.py new file mode 100644 index 0000000000..37b38cfae1 --- /dev/null +++ b/esmvaltool/diag_scripts/weathertyping/wt_utils.py @@ -0,0 +1,1744 @@ +# weathertyping diagnostic for esmvaltool + +# operating system manipulations (e.g. path constructions) +import json +import logging +import os +import warnings + +# plotting imports +import cartopy.crs as ccrs +import cartopy.feature as cfeature +# to manipulate iris cubes +import iris +import iris.analysis.cartography +import iris.plot as iplt +import matplotlib.pyplot as plt +import matplotlib.ticker as mticker +# general imports +import numpy as np +import pandas as pd +import seaborn as sns +from cartopy.mpl.gridliner import LATITUDE_FORMATTER, LONGITUDE_FORMATTER +from matplotlib.colors import ListedColormap + +# import internal esmvaltool modules here +from esmvaltool.diag_scripts.shared import ProvenanceLogger, group_metadata + +iris.FUTURE.datum_support = True + +logger = logging.getLogger(os.path.basename(__file__)) + +# Ignoring a warning that is produced when selecting timesteps of a weathertype +warnings.filterwarnings('ignore', '.*Collapsing a non-contiguous coordinate*') + + +def generate_grayscale_hex_values(x): + """Generate grayscale values for plotting seasonal occurrence.
+ + Args: + x (int): number of weathertypes + + Returns: + list: grayscale values as hex strings + """ + grayscale_values = np.linspace(0, 1, x) + grayscale_hex = [ + f'#{int(value * 255):02x}{int(value * 255):02x}{int(value * 255):02x}' + for value in grayscale_values] + + return grayscale_hex + + +def plot_seasonal_occurrence(cfg: dict, wt_cubes: iris.cube.Cube, + dataset_name: str, timerange: str, + ensemble: str = ""): + """Plot relative monthly/seasonal occurrence of weathertypes. + + Args: + cfg (dict): Configuration dict from recipe + wt_cubes (iris.cube.Cube): Cube with weathertypes + dataset_name (str): name of dataset + timerange (str): timerange of the data + ensemble (str, optional): ensemble member, if applicable + """ + output_path = f'{cfg["plot_dir"]}/seasonal_occurrence' + + os.makedirs(output_path, exist_ok=True) + + month_list = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', + 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'] + + relative_occurrences = {} + # {wt_string: {month: {wt1: rel_occurrence, wt2: rel_occurrence, ....}}} + # first do absolute occurrence, then relative occurrence + for cube in wt_cubes: + month_dict = {} + for month in range(1, 13): + month_constraint = iris.Constraint( + time=iris.time.PartialDateTime(month=month)) + array = cube.extract(month_constraint).data + unique, counts = np.unique(array, return_counts=True) + count_dict = dict(zip(unique, counts / sum(counts))) + month_dict[month] = count_dict + relative_occurrences[cube.long_name] = month_dict + + x = month_list + + for wt_string in relative_occurrences: + wt_numbers = max(len(value) for value in + relative_occurrences.get(wt_string).values()) + colors = generate_grayscale_hex_values(wt_numbers) + wt_stack = np.zeros((wt_numbers, 12)) + for month, month_value in relative_occurrences.get(wt_string).items(): + for wt in month_value.keys(): + wt_stack[np.int8(wt - 1), month - 1] = month_value.get(wt) + + y = np.vstack(list(wt_stack)) + + # plot + _, ax_ = plt.subplots(figsize=(10, 10)) + +
ax_.set_title(f'Seasonal occurrence of {wt_string}, \ + {dataset_name}, {timerange}') + + ax_.stackplot(x, y, colors=colors) + + ax_.legend(loc='upper center', bbox_to_anchor=(0.5, -0.05), + fancybox=True, shadow=True, ncol=9, + labels=tuple(f'WT {i+1}' for i in range(0, wt_numbers))) + + ax_.set(xlim=(0, 11), xticks=np.arange(0, 12), + ylim=(0, 1), yticks=np.arange(0, 1.1, 0.1)) + ax_.set_xlabel('Month') + ax_.set_ylabel('Cumulative relative occurrence') + + plt.savefig(f'{output_path}/{dataset_name}_{ensemble}_' + f'{wt_string}_rel_occurrence_{timerange}.png') + plt.savefig(f'{output_path}/{dataset_name}_{ensemble}_' + f'{wt_string}_rel_occurrence_{timerange}.pdf') + plt.close() + + +def get_cfg_vars(cfg: dict): + """Get list of vars from configuration dict. + + Args: + cfg (dict): Configuration dict from recipe. + + Returns: + tuple: cfg vars + """ + preproc_variables_dict = group_metadata( + cfg.get('input_data').values(), 'dataset') + + correlation_threshold = cfg.get('correlation_threshold') + rmse_threshold = cfg.get('rmse_threshold') + work_dir = cfg.get('work_dir') + plotting = cfg.get('plotting', False) + automatic_slwt = cfg.get('automatic_slwt') + predefined_slwt = cfg.get('predefined_slwt', False) + + return (preproc_variables_dict, correlation_threshold, rmse_threshold, + work_dir, plotting, automatic_slwt, predefined_slwt) + + +def load_wt_preprocessors(dataset: str, preproc_variables_dict: dict): + """Load preprocessor cubes for calculating Lamb and simplified + weathertypes.
+ + Args: + dataset (str): dataset name + preproc_variables_dict (dict): dictionary of preprocessor variables + + Returns: + tuple: list of preprocessor cubes for weathertyping + """ + wt_preproc = iris.load_cube( + preproc_variables_dict.get(dataset)[0].get('filename')) + wt_preproc_prcp = iris.load_cube( + preproc_variables_dict.get(dataset)[1].get('filename')) + wt_preproc_prcp_eobs = iris.load_cube( + preproc_variables_dict.get('E-OBS')[0].get('filename')) + + return wt_preproc, wt_preproc_prcp, wt_preproc_prcp_eobs + + +def get_ancestors_era5_eobs(dataset: str, preproc_variables_dict: dict): + """Get ancestors for ERA5/E-OBS. + + Args: + dataset (str): dataset name + preproc_variables_dict (dict): dictionary of preprocessor variables + + Returns: + tuple: lists of ERA5/E-OBS ancestors + """ + era5_ancestors = [ + preproc_variables_dict.get(dataset)[0].get('filename'), + preproc_variables_dict.get(dataset)[1].get('filename') + ] + eobs_ancestors = [ + preproc_variables_dict.get(dataset)[0].get('filename'), + preproc_variables_dict.get('E-OBS')[0].get('filename') + ] + return era5_ancestors, eobs_ancestors + + +def get_model_output_filepath(dataset: str, value: list): + """Generate output filepath for model data. + + Args: + dataset (str): Model name + value (list): Model variables + + Returns: + tuple: Output filepath and preprocessor path for + future referencing. + """ + model_name = dataset + timerange = value[0].get('timerange').replace('/', '-') + experiment = value[0].get('exp') + ensemble = value[0].get('ensemble') + + output_file_path = f'{model_name}/{experiment}/{ensemble}/{timerange}' + preproc_path = value[0].get('filename') + + return output_file_path, preproc_path + + +def get_preproc_lists(preproc_vars: list): + """Put preprocessor variables and paths into list for further use. + + Args: + preproc_vars (list): List of variables for specific + dataset.
+ + Returns: + tuple: List of preprocessor cubes for mean calculations + as well as list of paths to the files. + """ + preproc_path_psl = preproc_vars[-3].get('filename') + preproc_path_prcp = preproc_vars[-2].get('filename') + preproc_path_tas = preproc_vars[-1].get('filename') + + mean_preproc_psl = iris.load_cube(preproc_path_psl) + mean_preproc_prcp = iris.load_cube(preproc_path_prcp) + mean_preproc_tas = iris.load_cube(preproc_path_tas) + + preproc_list = [mean_preproc_psl, mean_preproc_prcp, mean_preproc_tas] + preproc_path_list = [preproc_path_psl, preproc_path_prcp, preproc_path_tas] + + return preproc_list, preproc_path_list + + +def get_looping_dict(preproc_vars: list): + """Put cubes into dictionary for further use. + + Args: + preproc_vars (list): Values of dataset dictionary + + Returns: + dict: Dictionary of the form {'var': [preproc_var, preproc_path]} + """ + preproc, preproc_path = get_preproc_lists(preproc_vars) + dict_ = { + 'psl': [preproc[0], preproc_path[0]], + 'pr': [preproc[1], preproc_path[1]], + 'tas': [preproc[2], preproc_path[2]] + } + return dict_ + + +def load_wt_files(path: str, only_lwt=False): + """Load *.nc files of weathertype data. + + If only_lwt is true, only Lamb weathertypes will be loaded + (useful for automatic_slwt = False). + + Args: + path (str): Path to weathertype data. + only_lwt (bool, optional): If True, + only Lamb weathertypes will be loaded. Defaults to False. + + Returns: + list: List of weathertype cubes.
+ """ + if not only_lwt: + lwt_cube = iris.load_cube(path, 'lwt') + slwt_era5_cube = iris.load_cube(path, 'slwt_era5') + slwt_eobs_cube = iris.load_cube(path, 'slwt_eobs') + wt_cubes = [lwt_cube, slwt_era5_cube, slwt_eobs_cube] + else: + lwt_cube = iris.load_cube(path, 'lwt') + wt_cubes = [lwt_cube] + + return wt_cubes + + +def plot_means(cfg: dict, + preproc_var: np.array, + wt_cubes: iris.cube.Cube, + data_info: dict, + only_lwt=False): + """Wrapper function to plot various means/std/anomalies. + + Args: + cfg (dict): cfg dictionary provided by recipe + preproc_var (np.array): variable to be plotted + wt_cubes (iris.cube.Cube): list of wt cubes + data_info (dict): dictionary with info to dataset + only_lwt (bool, optional): If True, + only Lamb weathertypes will be loaded. Defaults to False. + (useful for automatic_slwt = False) + """ + + if not only_lwt: + data_info['wt_string'] = 'lwt' + calc_wt_means( + cfg, + preproc_var, + wt_cubes, + data_info + ) + data_info['wt_string'] = 'slwt_ERA5' + calc_wt_means( + cfg, + preproc_var, + wt_cubes, + data_info + ) + data_info['wt_string'] = 'slwt_EOBS' + calc_wt_means( + cfg, + preproc_var, + wt_cubes, + data_info + ) + else: + data_info['wt_string'] = 'lwt' + calc_wt_means( + cfg, + preproc_var, + wt_cubes, + data_info + ) + + +def get_provenance_record(caption: str, ancestors: list, long_names: list, + plot_types: bool | list, + statistics: bool | list) -> dict: + """Get provenance record. + + Args: + caption (str): Caption for plots + ancestors (list): List of ancestors + long_names (list): Variable long names + plot_types (bool | list): Type of plot. Can be false + if output is not a plot. + statistics (bool | list): Types of statistics used. + + Returns: + dict: Provenance dictionary. 
+ """ + record = { + 'caption': caption, + 'domains': ['reg'], + 'authors': ['jury_martin', 'kroissenbrunner_thomas'], + 'references': ['maraun21jgr', 'jones93ijc'], + 'projects': ['preval'], + 'long_names': long_names, + 'ancestors': ancestors, + } + if plot_types is not False: + record['plot_types'] = plot_types + if statistics is not False: + record['statistics'] = statistics + return record + + +def log_provenance(filename: str, cfg: dict, provenance_record: dict): + """Log provenance. Produces xml file provenance info. + + Args: + caption (str): Caption of plots. + filename (str): Output name of provenance. + cfg (dict): Configuration dictionary provided by recipe. + ancestors (list): List of ancestors. + long_names (list): Variable long_names + plot_types (bool | list): Plot types. Can be false + if output is not a plot. + statistics (bool | list): Statistics used. + """ + + with ProvenanceLogger(cfg) as provenance_logger: + provenance_logger.log(filename, provenance_record) + + logger.info('Output stored as %s', filename) + + +def turn_list_to_mapping_dict(list_: list) -> dict: + """Turns a list of combined WT to a dictionary for further processing. + + Args: + list_ (list): List where entries are lists with related WT + + Returns: + dict: Mapping dicitonary keys are simplified WT, values are Lamb WT + """ + result_dict = {} + + for i, s in enumerate(list_): + for elem in s: + if elem not in result_dict: + result_dict[elem] = i + 1 + else: + result_dict[elem].append(i) + + return result_dict + + +def get_mapping_dict(selected_pairs: list) -> dict: + """Get mapping dictionary from list of selected pairs. 
+ + Args: + selected_pairs (list): Selected pairs of WTs based on + precipitation patterns over specified area and + correlation and RMSE thresholds defined in recipe. + + Returns: + dict: Mapping dictionary; keys are Lamb WTs, values are simplified WTs + """ + mapping_array = [] + + for pair in selected_pairs: + mapping_array.append(pair[0]) + + wt_sets = [set(i) for i in mapping_array if i] + + def find_intersection(m_list: list) -> list: + for i, v in enumerate(m_list): + for j, k in enumerate(m_list[i + 1:], i + 1): + if v & k: + m_list[i] = v.union(m_list.pop(j)) + return find_intersection(m_list) + return m_list + + merged_tuples = find_intersection(wt_sets) + mapping_dict = turn_list_to_mapping_dict(merged_tuples) + + return mapping_dict + + +def write_mapping_dict(work_dir: str, dataset: str, mapping_dict: dict): + """Write mapping dictionary to file. + + Args: + work_dir (str): Working directory + dataset (str): Name of dataset + mapping_dict (dict): Mapping dictionary + """ + mapping_dict_reformat = convert_dict(mapping_dict) + + with open(f'{work_dir}/wt_mapping_dict_{dataset}.json', + 'w', + encoding='utf-8') as file: + json.dump(mapping_dict_reformat, file) + + +def calc_slwt_obs(cfg: dict, lwt: np.array, cube: iris.cube.Cube, + dataset: str, ancestors: list, timerange: str) -> np.array: + """Calculate simplified weathertypes for observation datasets. + + The simplified weathertypes are combined from Lamb weathertypes with + similar precipitation patterns over the specified area.
+ + Args: + cfg (dict): Configuration dictionary from recipe + lwt (np.array): Array of Lamb WT + cube (iris.cube.Cube): preprocessor cube to keep time coordinate + dataset (str): Name of dataset + ancestors (list): list of ancestors + timerange (str): timerange of the data + + Returns: + np.array: Array of simplified weathertypes + """ + logger.info('Calculating simplified Lamb Weathertypes for %s', dataset) + + work_dir = cfg.get('work_dir') + correlation_threshold = cfg.get('correlation_threshold') + rmse_threshold = cfg.get('rmse_threshold') + tcoord = cube.coord('time') + + wt_data_prcp = [] + for wt_ in range(1, 28): + target_indices = np.where(lwt == wt_) + if len(target_indices[0]) < 1: + logger.info( + 'calc_slwt_obs - CAUTION: Skipped wt %s ' + 'for dataset %s!', wt_, dataset) + continue + dates = [ + tcoord.units.num2date(tcoord.points[i]) for i in target_indices + ] + if dataset == 'E-OBS': + extracted_cube = cube[target_indices] + else: + extracted_cube = cube.extract( + iris.Constraint(time=lambda t: t.point in dates[0])) + wt_cube_mean = extracted_cube.collapsed('time', iris.analysis.MEAN) + wt_data_prcp.append(wt_cube_mean.data.compressed()) + selected_pairs = process_prcp_mean(cfg, wt_data_prcp, + correlation_threshold, + rmse_threshold, dataset, timerange) + + with open(f'{work_dir}/wt_selected_pairs_{dataset}.json', + 'w', + encoding='utf-8') as file: + json.dump(selected_pairs, file) + + mapping_dict = get_mapping_dict(selected_pairs) + + write_mapping_dict(work_dir, dataset, mapping_dict) + + provenance_record = get_provenance_record('Lamb Weathertypes', + ancestors, + ['Lamb Weathertypes'], + False, False) + + log_provenance(f'{work_dir}/wt_selected_pairs_{dataset}.json', + cfg, provenance_record) + + return map_lwt_to_slwt(lwt, mapping_dict) + + +def calc_const(): + """Calculate constants for weathertyping algorithm. + Eq. taken from: Jones, P.D., Hulme, M. and Briffa, K.R.
(1993), + A comparison of Lamb circulation types with an objective classification + scheme. + Int. J. Climatol., 13: 655-663. https://doi.org/10.1002/joc.3370130606 + + Returns: + tuple: The four constants needed for WT calculation. + """ + + const1 = 1 / np.cos(45 * np.pi / 180) + const2 = np.sin(45 * np.pi / 180) / np.sin(40 * np.pi / 180) + const3 = np.sin(45 * np.pi / 180) / np.sin(50 * np.pi / 180) + const4 = 1 / (2 * np.cos(45 * np.pi / 180)**2) + + return const1, const2, const3, const4 + + +def calc_westerly_flow(cube: iris.cube.Cube) -> np.array: + """Calculate the westerly flow over area. + Eq. taken from: Jones, P.D., Hulme, M. and Briffa, K.R. (1993), + A comparison of Lamb circulation types with an objective classification + scheme. + Int. J. Climatol., 13: 655-663. https://doi.org/10.1002/joc.3370130606 + + Args: + cube (iris.cube.Cube): Cube of psl data. + + Returns: + np.array: westerly flow + """ + + return 1 / 2 * (cube.data[:, 1, 2] + cube.data[:, 1, 4]) - 1 / 2 * ( + cube.data[:, 3, 2] + cube.data[:, 3, 4]) + + +def calc_southerly_flow(cube: iris.cube.Cube, const1: float) -> np.array: + """Calculate the southerly flow over area. + Eq. taken from: Jones, P.D., Hulme, M. and Briffa, K.R. (1993), + A comparison of Lamb circulation types with an objective classification + scheme. + Int. J. Climatol., 13: 655-663. https://doi.org/10.1002/joc.3370130606 + + Args: + cube (iris.cube.Cube): Cube of psl data. + const1 (float): const1 + + Returns: + np.array: southerly flow + """ + + return const1 * ( + 1 / 4 * + (cube.data[:, 3, 4] + 2 * cube.data[:, 2, 4] + cube.data[:, 1, 4]) - + 1 / 4 * + (cube.data[:, 3, 2] + 2 * cube.data[:, 2, 2] + cube.data[:, 1, 2])) + + +def calc_resultant_flow(westerly_flow: np.array, + southerly_flow: np.array) -> np.array: + """Calculate the resultant flow. + Eq. taken from: Jones, P.D., Hulme, M. and Briffa, K.R. (1993), + A comparison of Lamb circulation types with an objective classification + scheme. + Int. J. 
Climatol., 13: 655-663. https://doi.org/10.1002/joc.3370130606 + + Args: + westerly_flow (np.array): westerly flow. + southerly_flow (np.array): southerly flow + + Returns: + np.array: resultant flow + """ + return (southerly_flow**2 + westerly_flow**2)**(1 / 2) + + +def calc_westerly_shear_velocity(cube: iris.cube.Cube, const2: float, + const3: float) -> np.array: + """Calculate westerly shear velocity. + Eq. taken from: Jones, P.D., Hulme, M. and Briffa, K.R. (1993), + A comparison of Lamb circulation types with an objective classification + scheme. + Int. J. Climatol., 13: 655-663. https://doi.org/10.1002/joc.3370130606 + + Args: + cube (iris.cube.Cube): cube of psl data + const2 (float): const2 + const3 (float): const3 + + Returns: + np.array: westerly shear velocity + """ + return const2 * (1 / 2 * + (cube.data[:, 0, 2] + cube.data[:, 0, 4]) - 1 / 2 * + (cube.data[:, 2, 2] + cube.data[:, 2, 4])) - const3 * ( + 1 / 2 * + (cube.data[:, 2, 2] + cube.data[:, 2, 4]) - 1 / 2 * + (cube.data[:, 4, 2] + cube.data[:, 4, 4])) + + +def calc_southerly_shear_velocity(cube: iris.cube.Cube, + const4: float) -> np.array: + """Calculate southerly shear velocity. + Eq. taken from: Jones, P.D., Hulme, M. and Briffa, K.R. (1993), + A comparison of Lamb circulation types with an objective classification + scheme. + Int. J. Climatol., 13: 655-663. 
https://doi.org/10.1002/joc.3370130606 + + Args: + cube (iris.cube.Cube): cube of psl data + const4 (float): const4 + + Returns: + np.array: southerly shear velocity + """ + return const4 * ( + 1 / 4 * + (cube.data[:, 3, 6] + 2 * cube.data[:, 2, 6] + cube.data[:, 1, 6]) - + 1 / 4 * + (cube.data[:, 3, 4] + 2 * cube.data[:, 2, 4] + cube.data[:, 1, 4]) - + 1 / 4 * + (cube.data[:, 3, 2] + 2 * cube.data[:, 2, 2] + cube.data[:, 1, 2]) + + 1 / 4 * + (cube.data[:, 3, 0] + 2 * cube.data[:, 2, 0] + cube.data[:, 1, 0])) + + +def calc_total_shear_velocity(westerly_shear_velocity: np.array, + southerly_shear_velocity: np.array) -> np.array: + """Calculate total shear velocity. + Eq. taken from: Jones, P.D., Hulme, M. and Briffa, K.R. (1993), + A comparison of Lamb circulation types with an objective classification + scheme. + Int. J. Climatol., 13: 655-663. https://doi.org/10.1002/joc.3370130606 + + Args: + westerly_shear_velocity (np.array): westerly shear velocity + southerly_shear_velocity (np.array): southerly shear velocity + + Returns: + np.array: total shear velocity + """ + return westerly_shear_velocity + southerly_shear_velocity + + +def wt_algorithm(cube: iris.cube.Cube, dataset: str) -> np.array: + """Algorithm to calculate Lamb weathertypes. + Eq. taken from: Jones, P.D., Hulme, M. and Briffa, K.R. (1993), + A comparison of Lamb circulation types with an objective classification + scheme. + Int. J. Climatol., 13: 655-663. 
https://doi.org/10.1002/joc.3370130606 + + Args: + cube (iris.cube.Cube): PSL field of dataset + dataset (str): Name of dataset + + Returns: + np.array: Array of Lamb WT for each day + """ + + # lats and lons corresponding to datapoints + # 55, 5 -> 1 + # 55, 15 -> 2 + # 50, -5 -> 3 + # 50, 5 -> 4 + # 50, 15 -> 5 + # 50, 25 -> 6 + # 45, -5 -> 7 + # 45, 5 -> 8 + # 45, 15 -> 9 + # 45, 25 -> 10 + # 40, -5 -> 11 + # 40, 5 -> 12 + # 40, 15 -> 13 + # 40, 25 -> 14 + # 35, 5 -> 15 + # 35, 15 -> 16 + + # lons: -5, 0, 5, 10, 15, 20, 25 + # lats: 35, 40, 45, 50, 55 + + logger.info('Calculating Lamb Weathertypes for %s', dataset) + + const1, const2, const3, const4 = calc_const() + + # westerly flow + westerly_flow = calc_westerly_flow(cube) + # southerly flow + southerly_flow = calc_southerly_flow(cube, const1) + # resultant flow + total_flow = calc_resultant_flow(westerly_flow, southerly_flow) + # westerly shear vorticity + westerly_shear_velocity = calc_westerly_shear_velocity(cube, const2, + const3) + # southerly shear vorticity + southerly_shear_velocity = calc_southerly_shear_velocity(cube, const4) + # total shear vorticity + total_shear_velocity = calc_total_shear_velocity(westerly_shear_velocity, + southerly_shear_velocity) + + weathertypes = np.zeros(len(total_shear_velocity)) + + for i, z_i in enumerate(total_shear_velocity): + + direction = (np.arctan(westerly_flow[i] / southerly_flow[i]) + * 180 / np.pi) # deg + if southerly_flow[i] >= 0: + direction += 180 # deg + + if direction < 0: + direction += 360 # deg + + # Lamb pure directional type + if abs(z_i) < total_flow[i]: + if 337.5 <= direction or direction < 22.5: + weathertypes[i] = 1 + elif 22.5 <= direction < 67.5: + weathertypes[i] = 2 + elif 67.5 <= direction < 112.5: + weathertypes[i] = 3 + elif 112.5 <= direction < 157.5: + weathertypes[i] = 4 + elif 157.5 <= direction < 202.5: + weathertypes[i] = 5 + elif 202.5 <= direction < 247.5: + weathertypes[i] = 6 + elif 247.5 <= direction < 292.5: + weathertypes[i] 
= 7 + elif 292.5 <= direction < 337.5: + weathertypes[i] = 8 + # Lamb’s pure cyclonic and anticyclonic type + elif (2 * total_flow[i]) < abs(z_i): + if z_i > 0: + weathertypes[i] = 9 + + elif z_i < 0: + weathertypes[i] = 10 + # Lambs’s synoptic/direction hybrid types + elif total_flow[i] < abs(z_i) < (2 * total_flow[i]): + if z_i > 0: + if 337.5 <= direction or direction < 22.5: + weathertypes[i] = 11 + elif 22.5 <= direction < 67.5: + weathertypes[i] = 12 + elif 67.5 <= direction < 112.5: + weathertypes[i] = 13 + elif 112.5 <= direction < 157.5: + weathertypes[i] = 14 + elif 157.5 <= direction < 202.5: + weathertypes[i] = 15 + elif 202.5 <= direction < 247.5: + weathertypes[i] = 16 + elif 247.5 <= direction < 292.5: + weathertypes[i] = 17 + elif 292.5 <= direction < 337.5: + weathertypes[i] = 18 + + elif z_i < 0: + if 337.5 <= direction or direction < 22.5: + weathertypes[i] = 19 + elif 22.5 <= direction < 67.5: + weathertypes[i] = 20 + elif 67.5 <= direction < 112.5: + weathertypes[i] = 21 + elif 112.5 <= direction < 157.5: + weathertypes[i] = 22 + elif 157.5 <= direction < 202.5: + weathertypes[i] = 23 + elif 202.5 <= direction < 247.5: + weathertypes[i] = 24 + elif 247.5 <= direction < 292.5: + weathertypes[i] = 25 + elif 292.5 <= direction < 337.5: + weathertypes[i] = 26 + # light indeterminate flow, corresponding to Lamb’s unclassified type U + elif abs(z_i) < 6 and total_flow[i] < 6: + weathertypes[i] = 27 + + return weathertypes + + +def calc_lwt_slwt_model(cfg: dict, cube: iris.cube.Cube, + data_info: dict, + predefined_slwt: bool | dict): + """Calculate Lamb WT and simplified WT for model data. 
+
+    Args:
+        cfg (dict): Configuration dictionary from recipe
+        cube (iris.cube.Cube): PSL field of dataset
+        data_info (dict): Dictionary with dataset name, preprocessor path,
+            output file path, ensemble and timerange
+        predefined_slwt (bool | dict): False, or a predefined mapping
+            dictionary of weathertypes
+    """
+
+    work_dir = cfg.get('work_dir')
+    dataset = data_info.get('dataset')
+    preproc_path = data_info.get('preproc_path')
+    output_file_path = data_info.get('output_file_path')
+    ensemble = data_info.get('ensemble', '')
+    timerange = data_info.get('timerange')
+
+    if not os.path.exists(f'{work_dir}/{output_file_path}'):
+        os.makedirs(f'{work_dir}/{output_file_path}')
+
+    lwt = wt_algorithm(cube, dataset)
+
+    tcoord = cube.coord('time')
+    time_points = tcoord.units.num2date(tcoord.points)
+
+    logger.info('Calculating simplified Lamb Weathertypes for %s', dataset)
+
+    if not predefined_slwt:
+        with open(f'{work_dir}/wt_mapping_dict_ERA5.json',
+                  'r',
+                  encoding='utf-8') as file:
+            mapping_dict_era5_f = json.load(file)
+
+        with open(f'{work_dir}/wt_mapping_dict_E-OBS.json',
+                  'r',
+                  encoding='utf-8') as file:
+            mapping_dict_eobs_f = json.load(file)
+        mapping_dict_era5 = reverse_convert_dict(mapping_dict_era5_f)
+        mapping_dict_eobs = reverse_convert_dict(mapping_dict_eobs_f)
+
+        slwt_era5 = map_lwt_to_slwt(lwt, mapping_dict_era5)
+        slwt_eobs = map_lwt_to_slwt(lwt, mapping_dict_eobs)
+    else:
+        predefined_slwt = check_mapping_dict_format(predefined_slwt)
+        write_mapping_dict(work_dir, 'ERA5', predefined_slwt)
+        write_mapping_dict(work_dir, 'E-OBS', predefined_slwt)
+        slwt_era5 = map_lwt_to_slwt(lwt, predefined_slwt)
+        slwt_eobs = map_lwt_to_slwt(lwt, predefined_slwt)
+
+    wt_cube = iris.cube.CubeList()
+    wt_cube.append(iris.cube.Cube(lwt, long_name='lwt'))
+    wt_cube.append(iris.cube.Cube(slwt_era5, long_name='slwt_era5'))
+    wt_cube.append(iris.cube.Cube(slwt_eobs, long_name='slwt_eobs'))
+
+    wt_cube[0].add_dim_coord(tcoord, 0)
+    wt_cube[1].add_dim_coord(tcoord, 0)
+    wt_cube[2].add_dim_coord(tcoord, 0)
+
+    iris.save(wt_cube,
f'{work_dir}/{output_file_path}/{dataset}_' + f'{ensemble}_{timerange}.nc') + + # write to csv file + d = { + 'date': time_points[:], + 'lwt': np.int8(lwt), + 'slwt_ERA5': np.int8(slwt_era5), + 'slwt_EOBS': np.int8(slwt_eobs), + } + df = pd.DataFrame(data=d) + df.to_csv(f'{work_dir}/{output_file_path}/{dataset}_{ensemble}_' + f'{timerange}.csv', + index=False) + + ancestors = [ + preproc_path, f'{work_dir}/wt_mapping_dict_ERA5.json', + f'{work_dir}/wt_mapping_dict_E-OBS.json' + ] + provenance_record = get_provenance_record('Lamb Weathertypes', + ancestors, + ['Lamb Weathertypes'], + False, False) + + log_provenance(f'{work_dir}/{output_file_path}/{dataset}', + cfg, provenance_record) + + +def get_colormap(colormap_string: str) -> ListedColormap: + """Get colormaps based on string. + + Args: + colormap_string (str): String to get Colormaps for either + psl, tas or precipitation. + + Returns: + ListedColormap: Choosen Colormap + """ + + misc_seq_2_disc = [ + (230 / 255, 240 / 255, 240 / 255), + (182 / 255, 217 / 255, 228 / 255), + (142 / 255, 192 / 255, 226 / 255), + (118 / 255, 163 / 255, 228 / 255), + (116 / 255, 130 / 255, 222 / 255), + (121 / 255, 97 / 255, 199 / 255), + (118 / 255, 66 / 255, 164 / 255), + (107 / 255, 40 / 255, 121 / 255), + (86 / 255, 22 / 255, 75 / 255), + (54 / 255, 14 / 255, 36 / 255), + ] + + temp_seq_disc = [ + (254 / 255, 254 / 255, 203 / 255), + (251 / 255, 235 / 255, 153 / 255), + (244 / 255, 204 / 255, 104 / 255), + (235 / 255, 167 / 255, 84 / 255), + (228 / 255, 134 / 255, 80 / 255), + (209 / 255, 98 / 255, 76 / 255), + (164 / 255, 70 / 255, 66 / 255), + (114 / 255, 55 / 255, 46 / 255), + (66 / 255, 40 / 255, 24 / 255), + (25 / 255, 25 / 255, 0 / 255), + ] + + prec_seq_disc = [ + (255 / 255, 255 / 255, 229 / 255), + (217 / 255, 235 / 255, 213 / 255), + (180 / 255, 216 / 255, 197 / 255), + (142 / 255, 197 / 255, 181 / 255), + (105 / 255, 177 / 255, 165 / 255), + (67 / 255, 158 / 255, 149 / 255), + (44 / 255, 135 / 255, 127 / 255), + (29 
/ 255, 110 / 255, 100 / 255),
+        (14 / 255, 85 / 255, 74 / 255),
+        (0 / 255, 60 / 255, 48 / 255),
+    ]
+
+    if colormap_string == 'psl':
+        return ListedColormap(misc_seq_2_disc)
+    if colormap_string == 'prcp':
+        return ListedColormap(prec_seq_disc)
+    if colormap_string == 'temp':
+        return ListedColormap(temp_seq_disc)
+
+    return None
+
+
+def plot_maps(wt: np.array, cfg: dict, cube: iris.cube.Cube,
+              data_info: dict, mode: str):
+    """Plot maps.
+
+    Args:
+        wt (np.array): WT for which statistic was calculated
+        cfg (dict): Configuration dictionary from recipe
+        cube (iris.cube.Cube): Data to be plotted
+        data_info (dict): Dictionary with dataset name, variable name,
+            wt_string (lwt, slwt_ERA5 or slwt_EOBS; slwt are calculated
+            based on ERA5 or E-OBS precipitation data, respectively),
+            ensemble and timerange
+        mode (str): Statistic that is plotted
+    """
+
+    dataset = data_info.get('dataset')
+    var_name = data_info.get('var')
+    wt_string = data_info.get('wt_string')
+    ensemble = data_info.get('ensemble')
+    timerange = data_info.get('timerange')
+
+    logger.info('Plotting %s %s %s for %s %s', dataset, var_name, mode,
+                wt_string, wt)
+
+    local_path = f"{cfg.get('plot_dir')}/{mode}"
+
+    if not os.path.exists(local_path):
+        os.makedirs(local_path)
+
+    ax = plt.axes(projection=ccrs.PlateCarree())
+
+    if var_name == 'psl':
+        psl_cmap = get_colormap('psl')
+        plt.title(f'{dataset} {ensemble}, {var_name} {mode}\n' +
+                  f'{timerange}, wt: {wt}')
+        unit = '[hPa]'
+        im = iplt.contourf(cube / 100, cmap=psl_cmap)
+        cb = plt.colorbar(im)
+        cb.ax.tick_params(labelsize=8)
+        cb.set_label(label=f'{var_name} {mode} {unit}')
+    elif var_name == 'pr':
+        prcp_cmap = get_colormap('prcp')
+        if dataset == 'ERA5':
+            unit = '[m]'
+            plt.title(f'{dataset} {ensemble}, total {var_name} {mode}\n' +
+                      f'{timerange}, wt: {wt}')
+        else:
+            unit = '[kg m-2 s-1]'
+            plt.title(f'{dataset} {ensemble}, {var_name} flux {mode}\n' +
+                      f'{timerange}, wt: {wt}')
+        im = iplt.contourf(cube, cmap=prcp_cmap)
+        cb =
plt.colorbar(im) + cb.ax.tick_params(labelsize=8) + cb.set_label(label=f'{var_name} {mode} {unit}') + elif var_name == 'tas': + temp_cmap = get_colormap('temp') + unit = '[K]' + plt.title(f'{dataset} {ensemble}, 1000 hPa {var_name} {mode}\n' + + f'{timerange}, wt: {wt}') + im = iplt.contourf(cube, cmap=temp_cmap) + cb = plt.colorbar(im) + cb.ax.tick_params(labelsize=8) + cb.set_label(label=f'{var_name} {mode} {unit}') + + gl = ax.gridlines( + crs=ccrs.PlateCarree(), + draw_labels=True, + linewidth=0.5, + color='gray', + alpha=0.5, + linestyle='--', + ) + gl.left_labels = True + gl.bottom_labels = True + gl.top_labels = False + gl.right_labels = False + gl.xlines = True + gl.ylocator = mticker.FixedLocator(np.arange(20, 70, 5)) + gl.xlocator = mticker.FixedLocator([-10, -5, 0, 5, 10, 15]) + gl.xformatter = LONGITUDE_FORMATTER + gl.yformatter = LATITUDE_FORMATTER + gl.xlabel_style = {'size': 8, 'color': 'black'} + gl.ylabel_style = {'color': 'black', 'size': 8} + + ax.set_extent([-15, 20, 27.5, 62.5]) + + ax.coastlines() + ax.add_feature(cfeature.BORDERS, linestyle=':') + + plt.savefig(f'{local_path}/{wt_string}_{wt}_{dataset}_{ensemble}' + f'_{var_name}_{mode}_{timerange}.png') + plt.savefig(f'{local_path}/{wt_string}_{wt}_{dataset}_{ensemble}_' + f'{var_name}_{mode}_{timerange}.pdf') + plt.close() + + +def rmse(subarray1: np.array, subarray2: np.array) -> np.array: + """Calculate rsme. + + Args: + subarray1 (np.array): array1 + subarray2 (np.array): array2 + + Returns: + np.array: rsme array + """ + return np.sqrt(np.mean((subarray1 - subarray2)**2)) + + +def convert_dict(dict_: dict) -> dict: + """Convert mapping dictionary from {lwt: slwt, ...} format to {slwt: [lwt1, + lwt2], ...}. 
+ + Args: + dict_ (dict): Dict in the {lwt: slwt, ...} format + + Returns: + dict: Dict in the {slwt: [lwt1, lwt2], ...} format + """ + + new_dict = {} + for dataset, value in dict_.items(): + if value not in new_dict: + new_dict[value] = [] + new_dict[value].append(dataset) + return new_dict + + +def reverse_convert_dict(dict_: dict) -> dict: + """Convert mapping dictionary from {slwt: [lwt1, lwt2], ...} format to + {lwt: slwt, ...}. + + Args: + original_dict (dict): Dict in the {slwt: [lwt1, lwt2], ...}format + + Returns: + dict: Dict in the format {lwt: slwt, ...} + """ + new_dict = {} + for key, value_list in dict_.items(): + for original_key in value_list: + new_dict[original_key] = key + return new_dict + + +def plot_corr_rmse_heatmaps(cfg: dict, pattern_correlation_matrix: np.array, + rmse_matrix: np.array, dataset: str, + timerange: str): + """Plot heatmaps for correlation and rmse matrices. + + Args: + cfg (dict): cfg dict from recipe + pattern_correlation_matrix (np.array): pattern correlation matrix + rmse_matrix (np.array): rmse matrix + dataset (str): string of dataset + """ + + output_path = f'{cfg.get("plot_dir")}/heatmaps' + + if not os.path.exists(f'{output_path}'): + os.makedirs(f'{output_path}') + + labels = np.arange(1, 28) + + mask = np.zeros_like(pattern_correlation_matrix) + mask[np.triu_indices_from(mask)] = True + with sns.axes_style('white'): + plt.figure(figsize=(10, 10)) + plt.title(f'Correlation Matrix , {dataset}, {timerange}') + levels = np.linspace(np.min(pattern_correlation_matrix), + np.max(pattern_correlation_matrix), 9) + ax = sns.heatmap(pattern_correlation_matrix, + mask=mask, + square=True, + annot=True, + annot_kws={'size': 6}, + cmap='seismic', + xticklabels=labels, + yticklabels=labels, + cbar_kws={ + 'ticks': levels, + 'shrink': 0.8, + 'format': '%.2f' + }) + ax.set_xlabel('lwt', fontsize=8) + ax.set_xticklabels(ax.get_xticklabels(), rotation=0) + ax.set_yticklabels(ax.get_yticklabels(), rotation=0) + ax.set_ylabel('lwt', 
fontsize=8) + plt.tight_layout() + plt.savefig(f'{output_path}/correlation_matrix_{dataset}_' + f'{timerange}.png') + plt.savefig(f'{output_path}/correlation_matrix_{dataset}_' + f'{timerange}.pdf') + plt.close() + + mask = np.zeros_like(rmse_matrix) + mask[np.triu_indices_from(mask)] = True + with sns.axes_style('white'): + plt.figure(figsize=(10, 10)) + plt.title(f'RMSE Matrix, {dataset}, {timerange}') + levels = np.linspace(np.min(rmse_matrix), np.max(rmse_matrix), 9) + ax = sns.heatmap(rmse_matrix, + mask=mask, + square=True, + annot=True, + annot_kws={'size': 5}, + cmap='seismic', + xticklabels=labels, + yticklabels=labels, + cbar_kws={ + 'ticks': levels, + 'shrink': 0.8, + 'format': '%.2f' + }) + ax.set_xlabel('lwt', fontsize=8) + ax.set_xticklabels(ax.get_xticklabels(), rotation=0) + ax.set_yticklabels(ax.get_yticklabels(), rotation=0) + ax.set_ylabel('lwt', fontsize=8) + plt.tight_layout() + plt.savefig(f'{output_path}/rmse_matrix_{dataset}_' + f'{timerange}.png') + plt.savefig(f'{output_path}/rmse_matrix_{dataset}_' + f'{timerange}.pdf') + plt.close() + + +def write_corr_rmse_to_csv(cfg: dict, pattern_correlation_matrix: np.array, + rmse_matrix: np.array, dataset: str): + """Write correlation and rsme matrix to csv files. 
+
+    Args:
+        cfg (dict): Configuration dictionary from recipe
+        pattern_correlation_matrix (np.array): Correlation matrix
+        rmse_matrix (np.array): RMSE matrix
+        dataset (str): Name of dataset
+    """
+
+    logger.info('Writing corr and rmse matrices for %s', dataset)
+
+    work_dir = cfg.get('work_dir')
+
+    df_corr = pd.DataFrame(pattern_correlation_matrix)
+    df_corr.index = range(1, len(df_corr) + 1)
+    df_corr.columns = range(1, len(df_corr.columns) + 1)
+    df_corr.to_csv(f'{work_dir}/correlation_matrix_{dataset}.csv',
+                   index_label='Index')
+
+    df_rmse = pd.DataFrame(rmse_matrix)
+    df_rmse.index = range(1, len(df_rmse) + 1)
+    df_rmse.columns = range(1, len(df_rmse.columns) + 1)
+    df_rmse.to_csv(f'{work_dir}/rmse_matrix_{dataset}.csv',
+                   index_label='Index')
+
+
+def process_prcp_mean(cfg: dict, data: np.array, correlation_threshold: float,
+                      rmse_threshold: float, dataset: str,
+                      timerange: str) -> list:
+    """Process precipitation fields for the specified area to get a list of
+    selected weathertype pairs with a correlation above correlation_threshold
+    and an RMSE below rmse_threshold for further processing and simplifying
+    the WT.
+
+    Args:
+        cfg (dict): Configuration dictionary from recipe
+        data (np.array): Precipitation data
+        correlation_threshold (float): Correlation threshold
+        rmse_threshold (float): RMSE threshold
+        dataset (str): Name of dataset
+        timerange (str): Time range of dataset
+
+    Returns:
+        list: Selected pairs of WT.
This is passed to get_mapping_dict + """ + + logger.info('Calculating corr and rsme matrices for %s', dataset) + + selected_pairs = [] + pattern_correlation_matrix = np.ma.corrcoef(data) + n = len(data) + rmse_matrix = np.zeros((n, n)) + + for i in range(n): + for j in range(i + 1, n): + rmse_matrix[i][j] = rmse(data[i], data[j]) + rmse_matrix[j][i] = rmse_matrix[i][j] + if (pattern_correlation_matrix[i][j] >= correlation_threshold + and rmse_matrix[i][j] <= rmse_threshold): + selected_pairs.append( + ((i + 1, j + 1), pattern_correlation_matrix[i][j], + rmse_matrix[i][j])) + + # write matrices to csv + write_corr_rmse_to_csv(cfg, pattern_correlation_matrix, rmse_matrix, + dataset) + # plot heatmaps for matrices + plot_corr_rmse_heatmaps(cfg, pattern_correlation_matrix, rmse_matrix, + dataset, timerange) + + return selected_pairs + + +def calc_wt_means(cfg: dict, cube: iris.cube.Cube, + wt_cubes: iris.cube.CubeList, data_info: dict): + """Calculate means of fields of each weathertype. + + Args: + cfg (dict): Configuration dictionary from recipe + cube (iris.cube.Cube): Cube with variable data + wt_cubes (iris.cube.CubeList): List of cubes of lwt, slwt_ERA5 + and slwt_EOBS + dataset (str): Name of dataset + var_name (str): Name of variable + wt_string (str): lwt, slwt_ERA5 or slwt_EOBS + slwt are calculated based on ERA5 or EOBS + precipitation data, respectively + preproc_path (str): Ancestor path + """ + + dataset = data_info.get('dataset') + var_name = data_info.get('var') + wt_string = data_info.get('wt_string') + preproc_path = data_info.get('preproc_path') + ensemble = data_info.get('ensemble') + timerange = data_info.get('timerange') + + logger.info('Calculating %s %s means for %s', dataset, var_name, wt_string) + + work_dir = cfg.get('work_dir') + + num_slwt = 0 + target_indices = [] + lwt, slwt_eobs, slwt_era5 = [], [], [] + + if wt_string == 'slwt_ERA5': + slwt_era5_cube = wt_cubes[1] + tcoord = slwt_era5_cube.coord('time') + slwt_era5 = 
slwt_era5_cube.data[:] + num_slwt = len(np.unique(slwt_era5)) + elif wt_string == 'slwt_EOBS': + slwt_eobs_cube = wt_cubes[2] + tcoord = slwt_eobs_cube.coord('time') + slwt_eobs = slwt_eobs_cube.data[:] + num_slwt = len(np.unique(slwt_eobs)) + elif wt_string == 'lwt': + lwt_cube = wt_cubes[0] + tcoord = lwt_cube.coord('time') + lwt = lwt_cube.data[:] + + # num_slwt_era5 = len(np.unique(slwt_era5)) + # num_slwt_eobs = len(np.unique(slwt_eobs)) + + # if num_slwt_eobs != num_slwt_era5: + # logger.info('calc_wt_means - CAUTION: unequal number of \ + # slwt_era5 (%s) \ and slwt_eobs (%s)!', + # num_slwt_era5, num_slwt_eobs) + + if 'slwt' in wt_string: + for wt in range(1, num_slwt + 1): + if wt_string == 'slwt_ERA5': + target_indices = np.where(slwt_era5 == wt) + elif wt_string == 'slwt_EOBS': + target_indices = np.where(slwt_eobs == wt) + else: + logger.info('WT_STRING not supported!') + if len(target_indices[0]) < 1: + logger.info( + 'calc_wt_means - CAUTION: Skipped %s %s \ + for dataset %s!', wt_string, wt, dataset) + continue + dates = [ + tcoord.units.num2date(tcoord.points[i]) for i in target_indices + ] + extracted_cube = cube.extract( + iris.Constraint(time=lambda t: t.point in dates[0])) + wt_cube_mean = extracted_cube.collapsed('time', iris.analysis.MEAN) + plot_maps(wt, cfg, wt_cube_mean, data_info, 'mean') + elif wt_string == 'lwt': + for wt in range(1, 28): + target_indices = np.where(lwt == wt) + if len(target_indices[0]) < 1: + logger.info( + 'calc_wt_means - CAUTION: Skipped lwt %s \ + for dataset %s!', wt, dataset) + continue + dates = [ + tcoord.units.num2date(tcoord.points[i]) for i in target_indices + ] + extracted_cube = cube.extract( + iris.Constraint(time=lambda t: t.point in dates[0])) + wt_cube_mean = extracted_cube.collapsed('time', iris.analysis.MEAN) + plot_maps(wt, cfg, wt_cube_mean, data_info, 'mean') + else: + logger.info('WT_STRING NOT SUPPORTED.') + + ancestors = [f'{preproc_path}', f'{work_dir}/ERA5.nc'] + provenance_record = 
get_provenance_record(f'{var_name} means for \ + {wt_string}', + ancestors, + [var_name], + ['map'], ['mean']) + + local_path = f"{cfg.get('plot_dir')}/mean" + + log_provenance(f'{local_path}/{wt_string}_{wt}_{dataset}_{ensemble}' + f'_{var_name}_mean_{timerange}', cfg, provenance_record) + + +def calc_wt_anomalies(cfg: dict, cube: iris.cube.Cube, + wt_cubes: iris.cube.CubeList, data_info: dict): + """Calculate anomalies of fields of each weathertype. + + Args: + cfg (dict): Configuration dictionary from recipe + cube (iris.cube.Cube): Cube with variable data + wt_cubes (iris.cube.CubeList): List of cubes of lwt, slwt_ERA5 + and slwt_EOBS + dataset (str): Name of dataset + var_name (str): Name of variable + wt_string (str): lwt, slwt_ERA5 or slwt_EOBS + slwt are calculated based on ERA5 or EOBS + precipitation data, respectively + preproc_path (str): Ancestor path + """ + + work_dir = cfg.get('work_dir') + dataset = data_info.get('dataset') + var_name = data_info.get('var_name') + wt_string = data_info.get('wt_string') + preproc_path = data_info.get('preproc_path') + ensemble = data_info.get('ensemble') + timerange = data_info.get('timerange') + + logger.info('Calculating %s %s anomalies for %s', dataset, var_name, + wt_string) + + target_indices = [] + lwt, slwt_eobs, slwt_era5 = [], [], [] + + if wt_string == 'slwt_ERA5': + slwt_era5_cube = wt_cubes[1] + tcoord = slwt_era5_cube.coord('time') + slwt_era5 = slwt_era5_cube.data[:] + elif wt_string == 'slwt_EOBS': + slwt_eobs_cube = wt_cubes[2] + tcoord = slwt_eobs_cube.coord('time') + slwt_eobs = slwt_eobs_cube.data[:] + elif wt_string == 'lwt': + lwt_cube = wt_cubes[0] + tcoord = lwt_cube.coord('time') + lwt = lwt_cube.data[:] + + num_slwt_era5 = len(np.unique(slwt_era5)) + num_slwt_eobs = len(np.unique(slwt_eobs)) + + if num_slwt_eobs != num_slwt_era5: + logger.info( + 'calc_wt_means - CAUTION: unequal number of \ + slwt_era5 (%s) and slwt_eobs (%s)!', num_slwt_era5, + num_slwt_eobs) + + if 'slwt' in wt_string: + 
for wt in range(1, max(num_slwt_era5, num_slwt_eobs)): + if wt_string == 'slwt_ERA5': + target_indices = np.where(slwt_era5 == wt) + elif wt_string == 'slwt_EOBS': + target_indices = np.where(slwt_eobs == wt) + else: + logger.info('WT_STRING not supported!') + if len(target_indices[0]) < 1: + logger.info( + 'calc_wt_anomalies - CAUTION: Skipped wt %s \ + for dataset %s!', wt, dataset) + continue + dates = [ + tcoord.units.num2date(tcoord.points[i]) for i in target_indices + ] + extracted_cube = cube.extract( + iris.Constraint(time=lambda t: t.point in dates[0])) + wt_cube_mean = extracted_cube.collapsed('time', iris.analysis.MEAN) + plot_maps( + wt, cfg, + cube.collapsed('time', iris.analysis.MEAN) - wt_cube_mean, + data_info, 'anomaly') + elif wt_string == 'lwt': + for wt in range(1, 28): + target_indices = np.where(lwt == wt) + if len(target_indices[0]) < 1: + logger.info( + 'calc_wt_anomalies - CAUTION: Skipped wt %s \ + for dataset %s!', wt, dataset) + continue + dates = [ + tcoord.units.num2date(tcoord.points[i]) for i in target_indices + ] + extracted_cube = cube.extract( + iris.Constraint(time=lambda t: t.point in dates[0])) + wt_cube_mean = extracted_cube.collapsed('time', iris.analysis.MEAN) + plot_maps( + wt, cfg, + cube.collapsed('time', iris.analysis.MEAN) - wt_cube_mean, + data_info, 'anomaly') + else: + logger.info('WT_STRING NOT SUPPORTED.') + + ancestors = [f'{preproc_path}', f'{work_dir}/ERA5.nc'] + provenance_record = get_provenance_record(f'{var_name} anomaly for \ + {wt_string}', + ancestors, + [var_name], + ['map'], ['anomaly']) + + local_path = f"{cfg.get('plot_dir')}/anomaly" + + log_provenance(f'{local_path}/{wt_string}_{wt}_{dataset}_{ensemble}' + f'_{var_name}_anomaly__{timerange}', cfg, + provenance_record) + + +def calc_wt_std(cfg: dict, cube: iris.cube.Cube, + wt_cubes: iris.cube.CubeList, data_info: dict): + """Calculate standard deviation of fields of each weathertype. 
+ + Args: + cfg (dict): Configuration dictionary from recipe + cube (iris.cube.Cube): Cube with variable data + wt_cubes (iris.cube.CubeList): List of cubes of lwt, slwt_ERA5 + and slwt_EOBS + dataset (str): Name of dataset + var_name (str): Name of variable + wt_string (str): lwt, slwt_ERA5 or slwt_EOBS + slwt are calculated based on ERA5 or EOBS + precipitation data, respectively + preproc_path (str): Ancestor path + """ + + work_dir = cfg.get('work_dir') + dataset = data_info.get('dataset') + var_name = data_info.get('var_name') + wt_string = data_info.get('wt_string') + preproc_path = data_info.get('preproc_path') + ensemble = data_info.get('ensemble') + timerange = data_info.get('timerange') + + logger.info('Calculating %s %s standard deviation for %s', dataset, + var_name, wt_string) + + target_indices = [] + lwt, slwt_eobs, slwt_era5 = [], [], [] + + if wt_string == 'slwt_ERA5': + slwt_era5_cube = wt_cubes[1] + tcoord = slwt_era5_cube.coord('time') + slwt_era5 = slwt_era5_cube.data[:] + elif wt_string == 'slwt_EOBS': + slwt_eobs_cube = wt_cubes[2] + tcoord = slwt_eobs_cube.coord('time') + slwt_eobs = slwt_eobs_cube.data[:] + elif wt_string == 'lwt': + lwt_cube = wt_cubes[0] + tcoord = lwt_cube.coord('time') + lwt = lwt_cube.data[:] + + num_slwt_era5 = len(np.unique(slwt_era5)) + num_slwt_eobs = len(np.unique(slwt_eobs)) + + if num_slwt_eobs != num_slwt_era5: + logger.info( + 'calc_wt_means - CAUTION: unequal number of \ + slwt_era5 (%s) and slwt_eobs (%s)!', num_slwt_era5, + num_slwt_eobs) + + if 'slwt' in wt_string: + for wt in range(1, max(num_slwt_era5, num_slwt_eobs)): + if wt_string == 'slwt_ERA5': + target_indices = np.where(slwt_era5 == wt) + elif wt_string == 'slwt_EOBS': + target_indices = np.where(slwt_eobs == wt) + else: + logger.info('WT_STRING not supported!') + if len(target_indices[0]) < 1: + logger.info( + 'calc_slwt_obs - CAUTION: Skipped wt %s \ + for dataset %s!', wt, dataset) + continue + dates = [ + 
tcoord.units.num2date(tcoord.points[i]) for i in target_indices + ] + extracted_cube = cube.extract( + iris.Constraint(time=lambda t: t.point in dates[0])) + wt_cube_std = extracted_cube.collapsed('time', + iris.analysis.STD_DEV) + plot_maps(wt, cfg, wt_cube_std, data_info, + 'stddev') + elif wt_string == 'lwt': + for wt in range(1, 28): + target_indices = np.where(lwt == wt) + if len(target_indices[0]) < 1: + logger.info( + 'calc_wt_std - CAUTION: Skipped wt %s \ + for dataset %s!', wt, dataset) + continue + dates = [ + tcoord.units.num2date(tcoord.points[i]) for i in target_indices + ] + extracted_cube = cube.extract( + iris.Constraint(time=lambda t: t.point in dates[0])) + wt_cube_std = extracted_cube.collapsed('time', + iris.analysis.STD_DEV) + plot_maps(wt, cfg, wt_cube_std, data_info, + 'stddev') + else: + logger.info('WT_STRING NOT SUPPORTED.') + + ancestors = [f'{preproc_path}', f'{work_dir}/ERA5.nc'] + provenance_record = get_provenance_record(f'{var_name} standard \ + deviation for \ + {wt_string}', + ancestors, + [var_name], + ['map'], ['stddev']) + + local_path = f"{cfg.get('plot_dir')}/stddev" + + log_provenance(f'{local_path}/{wt_string}_{wt}_{dataset}_{ensemble}' + f'_{var_name}_stddev_{timerange}', cfg, provenance_record) + + +def run_predefined_slwt(work_dir: str, dataset_name: str, + lwt: np.array, predefined_slwt: dict): + predefined_slwt = check_mapping_dict_format(predefined_slwt) + write_mapping_dict(work_dir, dataset_name, predefined_slwt) + write_mapping_dict(work_dir, "E-OBS", predefined_slwt) + slwt_era5 = map_lwt_to_slwt(lwt, predefined_slwt) + slwt_eobs = map_lwt_to_slwt(lwt, predefined_slwt) + + return slwt_era5, slwt_eobs + + +def combine_wt_to_file(cfg: dict, wt_list: list, + cube: iris.cube.Cube, + file_name: str): + """Combine lwt and slwt arrays to one file. 
+
+    Args:
+        cfg (dict): Configuration dictionary from recipe
+        wt_list (list): List of the lwt, slwt_era5 and slwt_eobs arrays
+        cube (iris.cube.Cube): Cube of data to keep time coordinate
+        file_name (str): Name of output file
+    """
+
+    lwt, slwt_era5, slwt_eobs = wt_list
+
+    logger.info('Writing weathertypes to %s', file_name)
+
+    tcoord = cube.coord('time')
+    time_points = tcoord.units.num2date(tcoord.points)
+
+    write_path = cfg.get('work_dir')
+
+    wt_cube = iris.cube.CubeList()
+    wt_cube.append(iris.cube.Cube(lwt, long_name='lwt'))
+    wt_cube.append(iris.cube.Cube(slwt_era5, long_name='slwt_era5'))
+    wt_cube.append(iris.cube.Cube(slwt_eobs, long_name='slwt_eobs'))
+
+    wt_cube[0].add_dim_coord(tcoord, 0)
+    wt_cube[1].add_dim_coord(tcoord, 0)
+    wt_cube[2].add_dim_coord(tcoord, 0)
+
+    iris.save(wt_cube, f'{write_path}/{file_name}.nc')
+
+    # write to csv file
+    d = {
+        'date': time_points[:],
+        'lwt': np.int8(lwt),
+        'slwt_era5': np.int8(slwt_era5),
+        'slwt_eobs': np.int8(slwt_eobs),
+    }
+    df = pd.DataFrame(data=d)
+    df.to_csv(f'{write_path}/{file_name}.csv', index=False)
+
+
+def write_lwt_to_file(cfg: dict, lwt: np.array, cube: iris.cube.Cube,
+                      file_name: str):
+    """Write only lwt to file.
+
+    Args:
+        cfg (dict): Configuration dictionary from recipe
+        lwt (np.ndarray): lwt array
+        cube (iris.cube.Cube): Cube to keep time coordinate
+        file_name (str): Output filename
+    """
+
+    logger.info('Writing Lamb Weathertype to %s', file_name)
+
+    tcoord = cube.coord('time')
+    time_points = tcoord.units.num2date(tcoord.points)
+
+    write_path = cfg.get('work_dir')
+
+    wt_cube = iris.cube.CubeList()
+    wt_cube.append(iris.cube.Cube(np.int8(lwt), long_name='lwt'))
+
+    wt_cube[0].add_dim_coord(tcoord, 0)
+    iris.save(wt_cube, f'{write_path}/{file_name}.nc')
+
+    # write to csv file
+    d = {'date': time_points[:], 'lwt': np.int8(lwt)}
+    df = pd.DataFrame(data=d)
+    df.to_csv(f'{write_path}/{file_name}.csv', index=False)
+
+
+def calc_lwt_model(cfg: dict, cube: iris.cube.Cube, dataset: str,
+                   data_info: dict):
+    """Calculate lwt for model data.
+
+    Args:
+        cfg (dict): Configuration dictionary from recipe
+        cube (iris.cube.Cube): Cube to keep time coordinate
+        dataset (str): Name of dataset (overwritten by
+            data_info['dataset'])
+        data_info (dict): Dataset metadata with keys dataset,
+            ensemble, timerange and output_file_path
+    """
+
+    work_dir = cfg.get('work_dir')
+    dataset = data_info.get('dataset')
+    output_file_path = data_info.get('output_file_path')
+    ensemble = data_info.get('ensemble', '')
+    timerange = data_info.get('timerange')
+
+    if not os.path.exists(f'{work_dir}/{output_file_path}'):
+        os.makedirs(f'{work_dir}/{output_file_path}')
+
+    wt = wt_algorithm(cube, dataset)
+
+    tcoord = cube.coord('time')
+    time_points = tcoord.units.num2date(tcoord.points)
+
+    wt_cube = iris.cube.CubeList()
+    wt_cube.append(iris.cube.Cube(wt, long_name='lwt'))
+
+    logger.info('Writing Lamb Weathertype for %s to file %s.nc',
+                dataset, dataset)
+
+    wt_cube[0].add_dim_coord(tcoord, 0)
+
+    iris.save(wt_cube, f'{work_dir}/{output_file_path}/{dataset}'
+              f'_{ensemble}_{timerange}.nc')
+
+    # write to csv file
+    d = {'date': time_points[:], 'lwt': np.int8(wt)}
+    df = pd.DataFrame(data=d)
+    df.to_csv(f'{work_dir}/{output_file_path}/{dataset}_{ensemble}_'
+              f'{timerange}.csv', index=False)
+
+    ancestors = [f'{work_dir}/wt_mapping_dict_ERA5.json',
+                 f'{work_dir}/wt_mapping_dict_E-OBS.json']
+    provenance_record = get_provenance_record('Lamb Weathertypes',
+                                              ancestors,
+                                              ['Lamb Weathertypes'],
+                                              False, False)
+
+    log_provenance(f'{work_dir}/{output_file_path}/{dataset}',
+                   cfg, provenance_record)
+
+
+def map_lwt_to_slwt(lwt: np.ndarray, mapping_dict: dict) -> np.ndarray:
+    """Map lwt array to slwt array.
+
+    Args:
+        lwt (np.ndarray): array of lwt
+        mapping_dict (dict): mapping dictionary in {lwt: slwt, ...} format
+
+    Returns:
+        np.ndarray: array of slwt; lwt values missing from
+            mapping_dict are mapped to 0
+    """
+
+    return np.array([np.int8(mapping_dict.get(value, 0)) for value in lwt])
+
+
+def check_mapping_dict_format(mapping_dict: dict) -> dict:
+    """Check format of mapping dict and return in {lwt: slwt, ...} format.
+
+    Args:
+        mapping_dict (dict): mapping dict in {lwt: slwt, ...} or
+            {slwt: [lwt, ...], ...} format
+
+    Returns:
+        dict: mapping dict in {lwt: slwt, ...} format
+    """
+
+    if isinstance(next(iter(mapping_dict.values())), list):
+        dict_ = reverse_convert_dict(mapping_dict)
+    else:
+        dict_ = mapping_dict
+
+    return dict_
diff --git a/esmvaltool/recipes/recipe_weathertyping_CMIP6.yml b/esmvaltool/recipes/recipe_weathertyping_CMIP6.yml
new file mode 100644
index 0000000000..15cac63542
--- /dev/null
+++ b/esmvaltool/recipes/recipe_weathertyping_CMIP6.yml
@@ -0,0 +1,273 @@
+documentation:
+  title: Weathertyping algorithm
+
+  description: |
+    A diagnostic to calculate Lamb weathertypes over a given region.
+    Furthermore, correlations between weathertypes and precipitation
+    patterns over a given area can be calculated, and 'combined' or
+    'simplified' weathertypes can be derived. Additionally, mean fields,
+    anomalies and standard deviations can be plotted.
+ + authors: + - jury_martin + - kroissenbrunner_thomas + + maintainer: + - kroissenbrunner_thomas + + projects: + - preval + + references: + - maraun21jgr + - jones93ijc + +datasets_psl_CMIP6: &datasets_psl_cmip6 + - {institute: AS-RCEC, dataset: TaiESM1, ensemble: r1i1p1f1, grid: gn, esgf_version: v20200626} + - {institute: AWI, dataset: AWI-ESM-1-1-LR, ensemble: r1i1p1f1, grid: gn, esgf_version: v20200212} + - {institute: BCC, dataset: BCC-CSM2-MR, ensemble: r1i1p1f1, grid: gn, esgf_version: v20181126} + - {institute: BCC, dataset: BCC-ESM1, ensemble: r1i1p1f1, grid: gn, esgf_version: v20181220} + - {institute: CAS, dataset: FGOALS-f3-L, ensemble: r1i1p1f1, grid: gr, esgf_version: v20191019} + - {institute: CAS, dataset: FGOALS-g3, ensemble: r1i1p1f1, grid: gn, esgf_version: v20190826} + - {institute: CCCR-IITM, dataset: IITM-ESM, ensemble: r1i1p1f1, grid: gn, esgf_version: v20210203} + - {institute: CCCma, dataset: CanESM5, ensemble: r1i1p1f1, grid: gn, esgf_version: v20190429} + - {institute: CMCC, dataset: CMCC-CM2-HR4, ensemble: r1i1p1f1, grid: gn, esgf_version: v20200904} + - {institute: CMCC, dataset: CMCC-CM2-SR5, ensemble: r1i1p1f1, grid: gn, esgf_version: v20200616} + - {institute: CMCC, dataset: CMCC-ESM2, ensemble: r1i1p1f1, grid: gn, esgf_version: v20210114} + - {institute: CNRM-CERFACS, dataset: CNRM-CM6-1, ensemble: r1i1p1f2, grid: gr, esgf_version: v20180917} + - {institute: CSIRO-ARCCSS, dataset: ACCESS-CM2, ensemble: r1i1p1f1, grid: gn, esgf_version: v20191108} + - {institute: CSIRO, dataset: ACCESS-ESM1-5, ensemble: r1i1p1f1, grid: gn, esgf_version: v20191115} + - {institute: EC-Earth-Consortium, dataset: EC-Earth3-AerChem, ensemble: r1i1p1f1, grid: gr, esgf_version: v20200624} + - {institute: EC-Earth-Consortium, dataset: EC-Earth3-CC, ensemble: r1i1p1f1, grid: gr, esgf_version: v20210113} + - {institute: EC-Earth-Consortium, dataset: EC-Earth3-Veg-LR, ensemble: r1i1p1f1, grid: gr, esgf_version: v20200217} + - {institute: EC-Earth-Consortium, 
dataset: EC-Earth3-Veg, ensemble: r1i1p1f1, grid: gr, esgf_version: v20211207} + - {institute: EC-Earth-Consortium, dataset: EC-Earth3, ensemble: r1i1p1f1, grid: gr, esgf_version: v20200310} + - {institute: HAMMOZ-Consortium, dataset: MPI-ESM-1-2-HAM, ensemble: r1i1p1f1, grid: gn, esgf_version: v20190627} + - {institute: INM, dataset: INM-CM4-8, ensemble: r1i1p1f1, grid: gr1, esgf_version: v20190530} + - {institute: INM, dataset: INM-CM5-0, ensemble: r1i1p1f1, grid: gr1, esgf_version: v20190610} + - {institute: IPSL, dataset: IPSL-CM5A2-INCA, ensemble: r1i1p1f1, grid: gr, esgf_version: v20200729} +# - {institute: IPSL, dataset: IPSL-CM6A-LR-INCA, ensemble: r1i1p1f1, grid: gr, esgf_version: v20210216} + - {institute: IPSL, dataset: IPSL-CM6A-LR, ensemble: r1i1p1f1, grid: gr, esgf_version: v20180803} + - {institute: KIOST, dataset: KIOST-ESM, ensemble: r1i1p1f1, grid: gr1, esgf_version: v20210601} + - {institute: MIROC, dataset: MIROC-ES2L, ensemble: r1i1p1f2, grid: gn, esgf_version: v20191129} + - {institute: MIROC, dataset: MIROC6, ensemble: r1i1p1f1, grid: gn, esgf_version: v20191016} +# - {institute: MPI-M, dataset: ICON-ESM-LR, ensemble: r1i1p1f1, grid: gn, esgf_version: v20210215} + - {institute: MPI-M, dataset: MPI-ESM1-2-HR, ensemble: r1i1p1f1, grid: gn, esgf_version: v20190710} + - {institute: MPI-M, dataset: MPI-ESM1-2-LR, ensemble: r1i1p1f1, grid: gn, esgf_version: v20190710} + - {institute: MRI, dataset: MRI-ESM2-0, ensemble: r1i1p1f1, grid: gn, esgf_version: v20190603} + - {institute: NASA-GISS, dataset: GISS-E2-2-G, ensemble: r1i1p1f1, grid: gn, esgf_version: v20191120, start_year: 1970} + - {institute: NCAR, dataset: CESM2-FV2, ensemble: r1i1p1f1, grid: gn, esgf_version: v20191120} + - {institute: NCAR, dataset: CESM2-WACCM-FV2, ensemble: r1i1p1f1, grid: gn, esgf_version: v20191120} +# - {institute: NCAR, dataset: CESM2-WACCM, ensemble: r1i1p1f1, grid: gn, esgf_version: v20190227} + - {institute: NCAR, dataset: CESM2, ensemble: r1i1p1f1, grid: gn, 
esgf_version: v20190308} + - {institute: NCC, dataset: NorCPM1, ensemble: r1i1p1f1, grid: gn, esgf_version: v20200724} + - {institute: NCC, dataset: NorESM2-LM, ensemble: r1i1p1f1, grid: gn, esgf_version: v20190815} + - {institute: NCC, dataset: NorESM2-MM, ensemble: r1i1p1f1, grid: gn, esgf_version: v20191108} + - {institute: NIMS-KMA, dataset: KACE-1-0-G, ensemble: r1i1p1f1, grid: gr, esgf_version: v20190911} + - {institute: NOAA-GFDL, dataset: GFDL-CM4, ensemble: r1i1p1f1, grid: gr1, esgf_version: v20180701} +# - {institute: NOAA-GFDL, dataset: GFDL-CM4, ensemble: r1i1p1f1, grid: gr2, esgf_version: v20180701} + - {institute: NOAA-GFDL, dataset: GFDL-ESM4, ensemble: r1i1p1f1, grid: gr1, esgf_version: v20190726} + - {institute: NUIST, dataset: NESM3, ensemble: r1i1p1f1, grid: gn, esgf_version: v20190812} + - {institute: SNU, dataset: SAM0-UNICON, ensemble: r1i1p1f1, grid: gn, esgf_version: v20190323} + +datasets_ERA5: &datasets_era5 + - {project: native6, dataset: ERA5, type: reanaly, frequency: day, latestversion: v1, mip: day, tier: 3, start_year: 1958, end_year: 2014} + +datasets_EOBS: &datasets_EOBS + - {dataset: E-OBS, project: OBS, version: v29.0e-0.25, type: ground, tier: 2, start_year: 1958, end_year: 2014, mip: day} + +preprocessors: + weathertype_preproc: + extract_time: &time + start_year: 1950 + start_month: 1 + start_day: 1 + end_year: 2014 + end_month: 12 + end_day: 31 + regrid: + # This is the region for which the Lamb weathertypes will be calculated. + # The objective classification scheme of Jones et al. 1993 is used, + # these grid points are the same ones as used in the paper. 
+    target_grid: &grid
+      start_longitude: -5
+      end_longitude: 25
+      step_longitude: 5
+      start_latitude: 35
+      end_latitude: 55
+      step_latitude: 5
+    scheme: linear
+
+  weathertype_preproc_pr:
+    extract_time:
+      *time
+    extract_region:
+      # This is the region whose precipitation data is used to calculate
+      # correlations between weathertypes and precipitation, so that
+      # weathertypes with similar precipitation patterns can be combined
+      # into 'simplified' weathertypes. Combined weathertypes correspond
+      # to a specific precipitation pattern over this region. You can
+      # change this to a region of your liking to see which weathertypes
+      # are associated with which precipitation patterns.
+      start_longitude: 9.5
+      end_longitude: 17.25
+      start_latitude: 46.25
+      end_latitude: 49
+
+  mean_preproc:
+    extract_time:
+      *time
+    extract_region: &region
+      start_longitude: -15
+      end_longitude: 35
+      start_latitude: 25
+      end_latitude: 65
+
+diagnostics:
+  weathertyping:
+    description: calculate weathertypes and plot means
+    variables:
+      era5_msl_wt: &era5_msl_wt
+        project: native6
+        dataset: ERA5
+        type: reanaly
+        frequency: day
+        mip: day
+        latestversion: v1
+        tier: 3
+        start_year: 1958
+        end_year: 2014
+        short_name: psl
+        preprocessor: weathertype_preproc
+        additional_datasets:
+          *datasets_era5
+
+      era5_pr_wt: &era5_pr_wt
+        project: native6
+        dataset: ERA5
+        type: reanaly
+        frequency: day
+        mip: day
+        latestversion: v1
+        tier: 3
+        start_year: 1958
+        end_year: 2014
+        short_name: pr
+        preprocessor: weathertype_preproc_pr
+        additional_datasets:
+          *datasets_era5
+
+      eobs_pr_wt: &eobs_pr_wt
+        project: OBS
+        dataset: E-OBS
+        type: ground
+        mip: day
+        version: v29.0e-0.25
+        tier: 2
+        start_year: 1958
+        end_year: 2014
+        short_name: pr
+        preprocessor: weathertype_preproc_pr
+        additional_datasets:
+          *datasets_EOBS
+
+      era5_msl_mean: &era5_msl
+        project: native6
+        dataset: ERA5
+        type: reanaly
+        frequency: day
+        mip: day
+        latestversion: v1
+        tier: 3
+        start_year: 1958
+        end_year: 2014
+        short_name: psl
+        
preprocessor: mean_preproc + additional_datasets: + *datasets_era5 + + era5_tp_mean: &era5_prcp + project: native6 + dataset: ERA5 + type: reanaly + frequency: day + mip: day + latestversion: v1 + tier: 3 + start_year: 1958 + end_year: 2014 + short_name: pr + preprocessor: mean_preproc + additional_datasets: + *datasets_era5 + + era5_tas_mean: &era5_temp + project: native6 + dataset: ERA5 + type: reanaly + frequency: day + mip: day + latestversion: v1 + tier: 3 + start_year: 1958 + end_year: 2014 + short_name: tas + preprocessor: mean_preproc + additional_datasets: + *datasets_era5 + + cmip6_historical_psl_day_wt: &cmip6_historical_wt + project: CMIP6 + activity: CMIP + exp: historical + mip: day + short_name: psl + preprocessor: weathertype_preproc + start_year: 1950 + end_year: 2014 + additional_datasets: + *datasets_psl_cmip6 + + cmip6_historical_psl_day_mean: &cmip6_historical_psl + project: CMIP6 + activity: CMIP + exp: historical + mip: day + short_name: psl + preprocessor: mean_preproc + start_year: 1950 + end_year: 2014 + additional_datasets: + *datasets_psl_cmip6 + + cmip6_historical_prcp_day_mean: &cmip6_historical_prcp + project: CMIP6 + activity: CMIP + exp: historical + mip: day + short_name: pr + preprocessor: mean_preproc + start_year: 1950 + end_year: 2014 + additional_datasets: + *datasets_psl_cmip6 + + cmip6_historical_temp_day_mean: &cmip6_historical_ta + project: CMIP6 + activity: CMIP + exp: historical + mip: day + short_name: tas + preprocessor: mean_preproc + start_year: 1950 + end_year: 2014 + additional_datasets: + *datasets_psl_cmip6 + + scripts: + weathertyping: + correlation_threshold: 0.9 + rmse_threshold: 0.002 + automatic_slwt: true + predefined_slwt: {1: [1, 2, 19], 2: [3, 4, 22, 21], 3: [5, 6, 15, 16], 4: [7, 8, 11, 18], 5: [9, 17], 6: [10, 20], 7: [12, 13, 14], 8: [23, 24], 9: [25, 26], 0: [27]} + plotting: true + script: weathertyping/weathertyping.py diff --git a/esmvaltool/references/jones93ijc.bibtex 
b/esmvaltool/references/jones93ijc.bibtex
new file mode 100644
index 0000000000..f7174621c0
--- /dev/null
+++ b/esmvaltool/references/jones93ijc.bibtex
@@ -0,0 +1,15 @@
+@article{jones93ijc,
+  author = {Jones, P. D. and Hulme, M. and Briffa, K. R.},
+  title = {A comparison of Lamb circulation types with an objective classification scheme},
+  journal = {International Journal of Climatology},
+  volume = {13},
+  number = {6},
+  pages = {655-663},
+  keywords = {Weather types, Temperature, Precipitation, British Isles},
+  doi = {10.1002/joc.3370130606},
+  url = {https://rmets.onlinelibrary.wiley.com/doi/abs/10.1002/joc.3370130606},
+  eprint = {https://rmets.onlinelibrary.wiley.com/doi/pdf/10.1002/joc.3370130606},
+  abstract = {Abstract An objective scheme, initially developed by Jenkinson and Collison, is used to classify daily circulation types over the British Isles, along the lines of the subjective method devised by Lamb. The scheme uses daily grid-point mean sea-level pressure data for the region. The results of the analysis over the period 1881–1989 are compared with ‘true’ Lamb weather types. The frequencies of objectively developed types are highly correlated with traditional Lamb types, especially so for synoptic (cyclonic and anticyclonic) types, although still good for wind directional types. Comparison of the two classification schemes reveals negligible differences between the correlations of the counts of circulation types and regional temperature and rainfall. 
The major difference between the two classification schemes is that the decline of the westerlies since 1940 is less evident with the objective scheme.},
+  year = {1993}
+}
+
diff --git a/esmvaltool/references/maraun21jgr.bibtex b/esmvaltool/references/maraun21jgr.bibtex
new file mode 100644
index 0000000000..b0bac79374
--- /dev/null
+++ b/esmvaltool/references/maraun21jgr.bibtex
@@ -0,0 +1,16 @@
+@article{maraun21jgr,
+  author = {Maraun, Douglas and Truhetz, Heimo and Schaffer, Armin},
+  title = {Regional Climate Model Biases, Their Dependence on Synoptic Circulation Biases and the Potential for Bias Adjustment: A Process-Oriented Evaluation of the Austrian Regional Climate Projections},
+  journal = {Journal of Geophysical Research: Atmospheres},
+  volume = {126},
+  number = {6},
+  pages = {e2020JD032824},
+  keywords = {Austria, bias adjustment, climate model evaluation, large-scale circulation errors, regional climate projections},
+  doi = {10.1029/2020JD032824},
+  url = {https://agupubs.onlinelibrary.wiley.com/doi/abs/10.1029/2020JD032824},
+  eprint = {https://agupubs.onlinelibrary.wiley.com/doi/pdf/10.1029/2020JD032824},
+  note = {e2020JD032824 2020JD032824},
+  abstract = {Abstract The Austrian regional climate projections are based on an ensemble of bias adjusted regional climate model simulations. Bias adjustment (BA) improves the usability of climate projections for impact studies, but cannot mitigate fundamental model errors. This argument holds in particular for biases in the temporal dependence, which is strongly influenced by the large-scale circulation. Global climate models (GCMs), underlying regional climate projections, suffer from substantial circulation errors. We therefore conduct a process-based evaluation of the Austrian climate projections focusing on large-scale circulation errors, their regional imprints and the potential for BA. 
First, we define nine synoptic weather types and assess how well the considered climate models represent their occurrence and persistence. Second, we assess biases in the overall dry and hot day probabilities, as well as conditional on synoptic weather type occurrence; and biases in the duration of dry and hot spells. Third, we investigate how these biases depend on biases in the occurrence and persistence of relevant synoptic weather types. And fourth, we study how much an overall BA improves these biases. Many GCMs misrepresent the occurrence and persistence of relevant synoptic weather types. These biases have a clear imprint on biases in dry and hot day occurrence and spell durations. BA in many cases helps to greatly reduce biases even in the presence of circulation biases, but may in some cases amplify conditional biases. Persistence biases are especially relevant for the representation of meteorological drought. Biases in the duration of dry spells cannot fully be mitigated by BA.}, + year = {2021} +} +
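
The recipe's `predefined_slwt` option maps several Lamb weathertypes onto one simplified type, and the diagnostic's `check_mapping_dict_format` / `map_lwt_to_slwt` helpers normalise and apply that mapping. A minimal standalone sketch of that step (plain numpy, no iris; `reverse_convert_dict` is defined elsewhere in the module, so the inlined version below is an assumption based on how it is used):

```python
import numpy as np


def reverse_convert_dict(slwt_to_lwt: dict) -> dict:
    # Assumed behaviour of the helper referenced in the diff:
    # invert {slwt: [lwt, ...], ...} into {lwt: slwt, ...}.
    return {lwt: slwt for slwt, lwts in slwt_to_lwt.items() for lwt in lwts}


def check_mapping_dict_format(mapping_dict: dict) -> dict:
    # Accept either format and normalise to {lwt: slwt, ...}.
    if isinstance(next(iter(mapping_dict.values())), list):
        return reverse_convert_dict(mapping_dict)
    return mapping_dict


def map_lwt_to_slwt(lwt: np.ndarray, mapping_dict: dict) -> np.ndarray:
    # Lamb types missing from the mapping fall back to simplified type 0.
    return np.array([np.int8(mapping_dict.get(value, 0)) for value in lwt])


# First two entries of the recipe's predefined_slwt, {slwt: [lwt, ...]} format
predefined_slwt = {1: [1, 2, 19], 2: [3, 4, 22, 21]}
mapping = check_mapping_dict_format(predefined_slwt)
print(map_lwt_to_slwt(np.array([1, 19, 3, 27]), mapping))  # -> [1 1 2 0]
```

Lamb type 27 is not listed under any simplified type in this truncated mapping, so it maps to 0, matching the `mapping_dict.get(value, 0)` fallback in the diagnostic.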