Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allowing to summarise columns not by a factor #9232

Merged
merged 10 commits into from
Nov 8, 2024

Conversation

lilyclements
Copy link
Contributor

@lilyclements lilyclements commented Nov 5, 2024

Fixes #5482
I've also fixed a typo in the error codes I noticed
And this fixes one of the issues discussed in #9238, reported by @Vitalis95 in summary_cor.

@rdstern this should fix the factor issue that Lisa reported. This is ready to test/review

@lilyclements lilyclements changed the title adding if statement to param list Allowing to summarise columns not by a factor Nov 5, 2024
Copy link
Collaborator

@rdstern rdstern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lilyclements many thanks for trying to fix this. Both Lisa and @dannyparsons said it used to work.
With your fix I don't get the same error, so it clearly helps. However:
a) I suggest it only gives a single value in the new summary sheet.
b) If you ask for (say) mean, min, max, count, it still just gives the first value.
c) It works when you put the results into the output window.

With your fix (I had missing values in my data), I just got a single NA in the results window. When I asked for the missing in the data to be ignored it made no difference.
So I then filtered out the missing values and it now gives an error.

So, I think there is either a bit more to do, or we (at least temporarily) just try to return to an adaptation of what worked previously. I need to check more and I'll then continue.

Yes, I think I now have a reasonable suggestion, and it is almost no work. Except, see below!

As Danny explained, one problem with no factors is that the linking of the data frames can't work. So we can't add to the new (one-row summary) data frame, ad we can when there are factors.

Now it does work, sending the results to the Output window. And I suggest that is a much more reasonable place for those summaries. So I propose that, if no factors, then it puts the results there, rather than into a one-row data frame. Neat eh?

Note that pressing the results to the output window is not enough. We must also NOT allow the results to be saved into a data frame. I think now, it sometimes is also writing more repeated variables back to the originating data frame.

Ideally we might tidy the layout of the output, but that could come later.

@N-thony
Copy link
Collaborator

N-thony commented Nov 6, 2024

@rdstern, I recommend that once this issue is resolved and the application is running smoothly with the current R version, we proceed with testing all dialogues using the calculation system before I make a new release again.

@rdstern
Copy link
Collaborator

rdstern commented Nov 6, 2024

@N-thony if @lilyclements agrees with my suggestion below, then I suggest you could quickly do this. It is "front-end" so I think much easier for you to do than Lily.
When there are no factors, I suggest that automatically the results only go to the Output window and not to a new data frame, or to add anything to the calling data frame.

Here is an example of the dialog: Note nothing in the Factors control on the right:

image

That is the situation where Lily has fixed a bug. I am suggesting that what you do is the same as you could achieve by unticking the Store Results and Ticking the Store results in Output window controlm i.e. if the dialog looked like this:

image

I am not suggesting you change these controls on the dialog, but that's what happens when Ok is pressed, if there are no factors.

@lilyclements
Copy link
Contributor Author

@rdstern is your (a), (b), (c) an issue you are currently getting, or something you are proposing (I assume the latter). In which case, this isn't R code bits and is front end stuff as you say.

I am not sure about removing the option to have it stored as a data frame if there are no factors. If I have all the R code up on the right hand side, or output, it can be difficult to flick through it and find that value I calculated before. Instead, I quite like being able to find it in a data frame. However, I am not our expected user, and if it is more confusing to have it show up this way then I understand.

@rdstern
Copy link
Collaborator

rdstern commented Nov 6, 2024

@lilyclements my a) , b) c) is what currently happens and also was the case in Version 0.7.6, where it was claimed to work.
I am also quite keen to propose that the solution of outputting, from this dialog only to the Output window, when there are no factors, is now (for us) a pretty good practice to recommend. That's because of the following:
a) If you want a one-line summary in its own new data frame, then it is quite easy to do. Just make a factor column with a single level and use that in the Summary dialog. The factor (with just one level) is not just for the heck of it. It is also the linking variable. So further summaries add to the same data frame.
b) If we can fix the one line situation without a factor, then it won't be linked. So not quite so good.
c) With just one or two summaries you probably want to look at them anyway - nice in the output window.
d) Now the final (new) important point - the sort of good practice stuff. In our "calculation system" we can produce 3 "levels" of summaries, or transformations. We can produce scalars, or columns or whole data frames.

We have recently added scalars to our datafrmae metadata. So I suggest that, if Lisa wants the overall mean to be able to use in further calculations then it is a scalar result from the calculator. So, instead of using the Summarise dialog and producing a whole dataframe for a single number, she just adds a scalar to the yearly dataframe as below. Then it is easily available for further calculations.

image

Nice, now?

@lilyclements
Copy link
Contributor Author

@rdstern great. Are you happy with the code changes in here for someone else to take on the front-end changes?

@rdstern
Copy link
Collaborator

rdstern commented Nov 6, 2024

@N-thony I wonder if you could do this quickly? It would then be a fix for Lisa's bug.
So from the code, as now,
If ok is pressed and there are no factors:
a) The output is is to the output window - so as though the Print Results to output window was ticked. (Whether or not it was ticked!)
b) The results are not stored in another data frame - so as though the Store Results was unticked. (Whether, or not it was unticked!)

I hope that will be easy?

@N-thony
Copy link
Collaborator

N-thony commented Nov 7, 2024

@N-thony I wonder if you could do this quickly? It would then be a fix for Lisa's bug. So from the code, as now, If ok is pressed and there are no factors: a) The output is is to the output window - so as though the Print Results to output window was ticked. (Whether or not it was ticked!) b) The results are not stored in another data frame - so as though the Store Results was unticked. (Whether, or not it was unticked!)

I hope that will be easy?

@rdstern I have made the changes, have a look.

Copy link
Collaborator

@rdstern rdstern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@N-thony it is almost there - working too well really
a) When no factor it seems fine. Great.
b) But when I add a factor, then it still goes to the output window. Ehen there is a factor, then it should obay the dialog.

image

@N-thony
Copy link
Collaborator

N-thony commented Nov 7, 2024

@rdstern have a look

@N-thony
Copy link
Collaborator

N-thony commented Nov 7, 2024

@rdstern have a look

@rdstern is this okay now?

Copy link
Collaborator

@rdstern rdstern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@N-thony that's better, but now it is in both the new data frame (good) and still the output window (bad).

image

@N-thony
Copy link
Collaborator

N-thony commented Nov 7, 2024

@N-thony that's better, but now it is in both the new data frame (good) and still the output window (bad).

image

@rdstern maybe I missed something about this dialogue. Isn't it because Print Results to Output Window is checked?
What should happen if store and print are both checked?

@rdstern
Copy link
Collaborator

rdstern commented Nov 7, 2024

@N-thony and that's just in your branch I think. It should not be checked by default. It isn't in Version 0.8,0, but it is in your branch. So please change that.

That's a relief that you found it!

Copy link
Collaborator

@rdstern rdstern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@N-thony perfect. I'd really like to get this merged in the forthcoming release.
@Patowhiz could you please check?

@N-thony N-thony merged commit 7ffe0c4 into IDEMSInternational:master Nov 8, 2024
2 checks passed
@N-thony
Copy link
Collaborator

N-thony commented Nov 12, 2024

@N-thony I wonder if you could do this quickly? It would then be a fix for Lisa's bug. So from the code, as now, If ok is pressed and there are no factors: a) The output is is to the output window - so as though the Print Results to output window was ticked. (Whether or not it was ticked!) b) The results are not stored in another data frame - so as though the Store Results was unticked. (Whether, or not it was unticked!)

I hope that will be easy?

@Patowhiz this was done but we had an issue when clicking on Reset, it is now corrected in PR #9244

@rdstern
Copy link
Collaborator

rdstern commented Nov 23, 2024

@Patowhiz please could you check now, and merge if ok. This is an important change and I thought it was in 0.8.0. I'd really like it in the version this week.

Thanks

@Patowhiz
Copy link
Contributor

Hi @rdstern, I believe this was merged about two weeks ago. However, it seems the changes might have been reverted or undone. Confirm with @N-thony?

@rdstern
Copy link
Collaborator

rdstern commented Nov 23, 2024

@N-thony and @Patowhiz my mistake, sorry. It was merged 2 weeks ago as Patrick says.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Column summaries output display for no by factors
4 participants