-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migrate legacy reporting #159
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #159 +/- ##
=======================================
- Coverage 75.7% 74.6% -1.1%
=======================================
Files 93 94 +1
Lines 6379 6428 +49
=======================================
- Hits 4831 4800 -31
- Misses 1548 1628 +80 |
Thanks for getting this started! Quick thoughts on only a few items:
I am undecided on whether we should also exclude them from coverage measurement. The legacy reporting is very big; in terms of line count it outweighs the entire existing message-ix-models code base. This combined with the low (previously: zero) test coverage would mean a big hit for the coverage of the package overall.
In theory, the unpacking of the data goes parameter-by-parameter from the Excel file into the Gzipped CSVs, and shouldn't hold more than one in memory at a time. We could try forcing the backend to clear its cache after each one. Also note that the repo contains a pre-unpacked version of snapshot-0, which saves time on CI. I think this might get unwieldy if we do this for every snapshot, but perhaps could do that in this case. If this gets to be a big set of tasks, we could do it in its own PR and then rebase this one.
😬 Good catch, 100% we should change this.
Agreed, 1 and no more for this initial PR. These can later be expanded as (internal) users need to ensure stable behaviour for other combinations of settings (though, again, we should prefer to expand the features of the genno-based reporting to provide what is needed). |
The tests are taking so long that I'll probably temporarily disable all but one version, too. And we might want to consider moving these tests to some kind of nightly workflow instead of testing on every push to every PR. |
Indeed, see #139 |
e65b427
to
3f87d74
Compare
On Either way, the snapshot-0 data still contain "Equation and variable data" that snapshot-1 is still missing. They were added through this commit, which describes the command with which they were created. |
This PR is going to create a headache because we are again including a Unfortunately, this duplicate name is not what's causing the current test failure. Testing locally, the check |
376f880
to
489e179
Compare
Regarding test coverage, I waited with a decision to see how large the final drop would be. Even with the existing test (which is a single call to the legacy |
The main issue about reduced test coverage seems to be that the functions in |
Towards getting this merged, can we please reword or drop the following commits?
I will meanwhile go through and leave some comments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At some point, we may want to write tests that refer to the output of the legacy reporting—for instance, to check improvements to the genno-based reporting.
There would be different ways to do that:
- On one's own machine, run the legacy reporting on snapshot-1 (either directly or via the tests) and either:
- Pick up the generated file.
- Export the time series data stored on the Scenario.
- As part of this workflow, add actions/upload-artifact to scoop up the generated file.
I don't think we need to do either for this PR, just making a note of our options
b1d58ee
to
d8a6fa9
Compare
7abdff4
to
2ad3837
Compare
2ad3837
to
2a0e7e6
Compare
This PR takes the current version of the "legacy reporting" from message_data's ssp_dev branch and plugs it into
message_ix_models/report/legacy
.No changes were made to the code in order to preserve comparability to other versions of the legacy reporting that might exist in unknown locations. This is why the files in
message_ix_models/report/legacy
had to be specifically excluded from code improvement tools such as mypy and ruff.Since we want to confirm that the legacy reporting runs with the latest version of the global model baseline snapshot on zenodo, I added information about this snapshot to
SOURCES
inutil/pooch
. Some tests apart fromtests/report/test_legacy::test_legacy_report
might not work due to this change. In particular, ss_reporter and unpack_snapshot_data seem to only handlesnapshot-0
for now.On my local system, running
test_load
almost crashed my computer trying to unpack allsnapshot-1
data. It resulted in aJava heap space OutOfMemoryError
.On a tangent here,
ss_reporter
is a name where I'd vividly wish for a change, e.g. tosimulated_solution_reporter
. All these shortened names are bad in themselves, but even more so when this creates references to literal nazi organizations.I didn't dare running
test_legacy_report
. I'm not sure the current setup is the best withload_snapshots
a session-scoped fixture parametrized for the snapshots and yielding scenarios with them, but I'm not sure what else to do.test_legacy_report
definitely still needs to be adapted to cover all function parameters/options, if that's our goal. We should at least find one working combination of those and bake it into the tests, I think.Further TODOs that can be expected:
legacy_report()
can handle the new unitsBefore starting this PR, I started migrating the legacy reporting while cleaning it up at the same time. This is nowhere near completion, but the point I got to can be seen at https://github.com/iiasa/message-ix-models/tree/modernize/legacy-reporting.
How to review
TBA
PR checklist