Bug in clear_with_dataframe() when using multiple hierarchies in dimension_mapping #1068

Open
151N3 opened this issue Mar 8, 2024 · 4 comments

@151N3
Contributor

151N3 commented Mar 8, 2024

Describe the bug
There is a bug in the iteration over the hierarchies here.
We initialize the mdx_selections dict, and while iterating over the hierarchies we overwrite the previous entry, because the dict can hold only one key per dimension name.


dimension_mapping = {"mydimension": ["hierarchie_1", "hierarchie_2"]}


mdx_selections[dimension] = MdxHierarchySet.tm1_subset_all(
    dimension=dimension,
    hierarchy=hierarchy).filter_by_level(0)
# 1st iteration: mdx_selections[dimension].to_mdx() == "{TM1FILTERBYLEVEL({TM1SUBSETALL([mydimension].[hierarchie_1])},0)}"
# 2nd iteration: mdx_selections[dimension].to_mdx() == "{TM1FILTERBYLEVEL({TM1SUBSETALL([mydimension].[hierarchie_2])},0)}"

The final MDX then looks something like this: mdx_builder.to_mdx() =

SELECT
NON EMPTY 
*{[year].[year].[2024]} 
*{[month].[month].[04]} 
*{TM1FILTERBYLEVEL({TM1SUBSETALL([mydimension].[hierarchie_2])},0)} 
DIMENSION PROPERTIES MEMBER_NAME ON 0
FROM [mycube]
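
The overwrite can be reproduced without a TM1 connection. This is only a minimal sketch: plain strings stand in for the MdxHierarchySet objects, and build_selections is a hypothetical helper written for illustration, not TM1py code.

```python
def build_selections(dimension_mapping):
    """Mimic the loop described above: one dict key per dimension name."""
    mdx_selections = {}
    for dimension, hierarchies in dimension_mapping.items():
        for hierarchy in hierarchies:
            # Same key on every pass, so the previous hierarchy is replaced.
            mdx_selections[dimension] = (
                f"{{TM1FILTERBYLEVEL({{TM1SUBSETALL([{dimension}].[{hierarchy}])}},0)}}"
            )
    return mdx_selections


selections = build_selections({"mydimension": ["hierarchie_1", "hierarchie_2"]})
print(selections)
```

Only the entry for hierarchie_2 survives, which matches the single mydimension set in the MDX above.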

Expected behavior
We would expect this MDX result:

mdx_builder.to_mdx() =

SELECT
NON EMPTY 
*{[year].[year].[2024]} 
*{[month].[month].[04]} 
* {TM1FILTERBYLEVEL({TM1SUBSETALL([mydimension].[hierarchie_1])},0)}
*{TM1FILTERBYLEVEL({TM1SUBSETALL([mydimension].[hierarchie_2])},0)} 
DIMENSION PROPERTIES MEMBER_NAME ON 0
FROM [mycube]

It would be nice to have some more logging, e.g. printing the final MDX statement, so one can see what's going on.

There is another minor bug at L692:
the variable dimension is not assigned yet at that point, so the message should use column instead:

raise ValueError(f"Value for key '{column}' in dimension_mapping must be of type str")

ToDo: We should check whether the hierarchies in the mapping match the ones in the TM1 dimension, similar to what is done with unmatched_dimension_names.
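
A rough sketch of what such a check could look like. Everything here is an assumption for illustration: find_unmatched_hierarchies is a hypothetical helper, and existing_hierarchies stands for whatever the function would fetch from the server as {dimension: hierarchy names}. Names are compared case- and space-insensitively, as TM1 object names usually are.

```python
def find_unmatched_hierarchies(dimension_mapping, existing_hierarchies):
    """Return {dimension: [hierarchy, ...]} for mapped hierarchies that do not exist.

    existing_hierarchies is a hypothetical input: {dimension: iterable of names}.
    """
    def normalize(name):
        # Assumption: compare names ignoring case and spaces.
        return name.lower().replace(" ", "")

    unmatched = {}
    for dimension, hierarchies in dimension_mapping.items():
        if isinstance(hierarchies, str):
            hierarchies = [hierarchies]
        known = {normalize(name) for name in existing_hierarchies.get(dimension, [])}
        missing = [h for h in hierarchies if normalize(h) not in known]
        if missing:
            unmatched[dimension] = missing
    return unmatched


mapping = {"mydimension": ["hierarchie_1", "hierarchie_3"]}
existing = {"mydimension": ["Hierarchie_1", "hierarchie_2"]}
print(find_unmatched_hierarchies(mapping, existing))
```

A non-empty result could then raise a ValueError up front, before any MDX is built.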

Version
TM1py 2.0.0
TM1 Server Version: 11.8


@151N3 151N3 added the bug label Mar 8, 2024
@151N3
Contributor Author

151N3 commented Mar 8, 2024

My Solution:

            elif isinstance(hierarchies, Iterable):
                for hierarchy in hierarchies:
                    if dimension not in mdx_selections:
                        mdx_selections[dimension] = MdxHierarchySet.tm1_subset_all(
                            dimension=dimension,
                            hierarchy=hierarchy).filter_by_level(0)
                    else:
                        mdx_selections[hierarchy] = MdxHierarchySet.tm1_subset_all(
                            dimension=dimension,
                            hierarchy=hierarchy).filter_by_level(0)

or just

            elif isinstance(hierarchies, Iterable):
                for hierarchy in hierarchies:
                    dimension_hierarchy = f"{dimension}_{hierarchy}"
                    mdx_selections[dimension_hierarchy] = MdxHierarchySet.tm1_subset_all(
                        dimension=dimension,
                        hierarchy=hierarchy).filter_by_level(0)

This works because we are not using the keys of the mdx_selections dictionary for anything. What do you think, @MariusWirtz?

@MariusWirtz
Collaborator

If I understand you right, I think your proposed solution is to clear the cross join of both hierarchies.
I think that's not ideal. If someone provides multiple hierarchies to a dimension, they may want to clear both hierarchies instead of clearing the crossjoin of both hierarchies.

Here is a simple example:
Cube c1 with two dimensions d1 and d2.
d2 has 2 hierarchies: h3 and h4.

There is leaf level data on

  • d1 x d2:h3
  • d1 x d2:h4

When looking at the crossjoin of d1 x h3 x h4 there is no data.


I suggest when multiple hierarchies are passed in the dimension_mapping, we do the clear in separate operations.
What do you think?

So we would clear d1 x d2:h3 and d1 x d2:h4 instead of clearing d1 x h3 x h4
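
One way to picture the separate operations is to expand the mapping into one mapping per hierarchy before clearing. This is only a sketch: split_mapping is a hypothetical helper, and taking the product across several multi-hierarchy dimensions is an assumption about the desired behavior, not something settled in this thread.

```python
from itertools import product


def split_mapping(dimension_mapping):
    """Yield one mapping per hierarchy combination, with a single hierarchy per dimension."""
    dimensions = list(dimension_mapping)
    choices = [
        [value] if isinstance(value, str) else list(value)
        for value in dimension_mapping.values()
    ]
    for combination in product(*choices):
        yield dict(zip(dimensions, combination))


mappings = list(split_mapping({"d2": ["h3", "h4"]}))
print(mappings)
```

Each resulting mapping would then go into its own clear call, so d1 x d2:h3 and d1 x d2:h4 are cleared one after the other. This assumes a single hierarchy name as a plain string is accepted as a mapping value.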

@151N3
Contributor Author

151N3 commented Apr 8, 2024

Hey @MariusWirtz,
that's a valid point. I didn't think about the cross join. In that case, you are right and it's not a bug but a feature :-)

There are two other things still worth considering:

There is another minor bug at L692:
the variable dimension is not assigned yet at that point, so the message should use column instead:

raise ValueError(f"Value for key '{column}' in dimension_mapping must be of type str")

ToDo: We should check whether the hierarchies in the mapping match the ones in the TM1 dimension, similar to what is done with unmatched_dimension_names.

@151N3
Contributor Author

151N3 commented Dec 4, 2024

Hey @MariusWirtz,

We had some issues with the function lately: a cross join occurred in the MDX statement when we wanted to delete multiple periods from two different years. Here is the relevant part of the generated MDX:

SELECT
NON EMPTY
* {[year].[year].[2023],[year].[year].[2024]} 
* {[month].[month].[m01],[month].[month].[m02],[month].[month].[m03],[month].[month].[m04],[month].[month].[m05],[month].[month].[m09],[month].[month].[m10]} 
 DIMENSION PROPERTIES MEMBER_NAME ON 0
FROM [mycube]
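
To show why this loses data, here is a small sketch with made-up year/month pairs (hypothetical values, not the real dataframe). It assumes the MDX is built from the distinct values of each column, as the statement above suggests.

```python
from itertools import product

# Hypothetical rows the caller actually wants to clear.
requested = {("2023", "m09"), ("2023", "m10"), ("2024", "m01")}

years = sorted({year for year, _ in requested})
months = sorted({month for _, month in requested})

# A cross join of the distinct values covers every year/month pair.
cross_join = set(product(years, months))

# Pairs that get cleared although nobody asked for them.
unintended = sorted(cross_join - requested)
print(unintended)
```

With three requested rows, the cross join touches six slices, so three periods are cleared by accident.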

My workaround was to loop through the dataframe row by row and execute clear_with_dataframe for each row:

        for i in range(len(df)):
            # Double brackets keep the row as a one-row DataFrame
            row_df = df.iloc[[i]]
            logger.info(f"Clearing slice/period {row_df.to_dict('records')}\n "
                        f"from cube: {cube_name} with mapping: {dimension_mapping}")
            tm1.cells.clear_with_dataframe(cube=cube_name, df=row_df, dimension_mapping=dimension_mapping)

I think it would be worth integrating this kind of logic into the clear_with_dataframe function, not just for the dataframe but eventually for the initial problem with the dimension_mapping as well:

SELECT
NON EMPTY 
*{[year].[year].[2024]} 
*{[month].[month].[04]} 
* {TM1FILTERBYLEVEL({TM1SUBSETALL([mydimension].[hierarchie_1])},0)}
*{TM1FILTERBYLEVEL({TM1SUBSETALL([mydimension].[hierarchie_2])},0)} 
DIMENSION PROPERTIES MEMBER_NAME ON 0
FROM [mycube]

It means that we have to execute the clear_with_mdx function multiple times in a row, but we avoid the cross join issue. What do you think?

PS: It was hard debugging the data loss, so maybe we should name the process in L755?
