Introduce `run.clone()` #141

glatterf42 · 2024-11-29T13:46:35Z

Closes #133 :)
This is the next part of cleaning up #101, but it also improves the code that was originally introduced there, though I'm not entirely happy with all changes.

The goal was to enable `run.clone()` in the core layer. For that, `clone()` had to move out of the core `RunRepository`, since `Run` is not related to it. The most logical place seemed to be `backend.runs.clone()`, so I put it there, but that also warrants adding `clone()` to the REST API.
With `clone()`, we're creating a new run, which sounds like a POST request to me. However, we already intercept all POST requests to `/runs/`, so I had to find another way. Rather than using a different request method, I opted to route all POST requests to `/runs/clone/`, which is not otherwise in use, to `clone()` in the DB layer. Please let me know if this is appropriate.
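As a rough illustration of the routing idea only (not the actual server code), the sketch below registers two POST handlers under distinct exact paths. The registry, the handler names, and the payload fields `model`, `scenario`, and `run_id` are all assumptions made for this example.

```python
from __future__ import annotations

from typing import Callable

# Hypothetical in-memory stand-ins; none of these names come from ixmp4.
Handler = Callable[[dict], dict]
ROUTES: dict[tuple[str, str], Handler] = {}


def route(method: str, path: str) -> Callable[[Handler], Handler]:
    """Register a handler for an exact (method, path) pair."""

    def register(handler: Handler) -> Handler:
        ROUTES[(method, path)] = handler
        return handler

    return register


@route("POST", "/runs/")
def create_run(payload: dict) -> dict:
    # Existing behaviour: POST /runs/ creates a fresh run.
    return {"action": "create", "model": payload["model"], "scenario": payload["scenario"]}


@route("POST", "/runs/clone/")
def clone_run(payload: dict) -> dict:
    # New behaviour: POST /runs/clone/ copies an existing run.
    return {"action": "clone", "source_run_id": payload["run_id"]}


def dispatch(method: str, path: str, payload: dict) -> dict:
    """Look up the handler by exact match, so the two POST routes never collide."""
    return ROUTES[(method, path)](payload)
```

Because the lookup is an exact match on the path, a POST to `/runs/clone/` never reaches the handler for `/runs/`, which is the property the paragraph above relies on.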
The actual logic for `clone()` worked well in the core layer, where it could use the `run.iamc` object, which encapsulates some validation and formatting operations. I tried to move these helper functions to the DB layer, but some require access to a specific run, which makes them unsuitable for the DB `RunRepository`. Thus, I duplicated their logic within the `clone()` function, which is likely not the best solution. Please let me know which alternative we could use.
Lastly, I found two oddities concerning type hints, both marked as TODO inline:

1. `core.indexset.data` now returns `list[float] | list[int] | list[str]`, which is technically correct: there can only be one type in any `indexset.data` list. However, type checkers then don't know whether to allow something like `indexset.add("foo")`, since they can't guarantee that `.data` is a `list[str]` this time. I'm not sure how, but it would be great to annotate `.data` dynamically, I guess, so that it's recognized correctly and specifically.
2. Some iamc functions like `iamc.datapoints.bulk_upsert()` are annotated inconsistently. In the abstract layer, they claim to accept a `pd.DataFrame`, whereas in the DB layer, they require a `pandera.DataFrame` of a specific schema. This leads to some `type: ignore` comments that we could probably remove, but I'm not sure how we would want to do that.
   We could of course adapt the type hint to accept both kinds of dataframes, but we probably want to ensure that the data are validated even when the DB layer is called directly, so we probably don't want to let plain `pd.DataFrame`s slip through. Using only `pandera.DataFrame`s might be tricky, too, as e.g. `pandera.DataFrame["run__id"]` tells me I can't index a `pandera.DataFrame`. I'm happy to hear suggestions here :)
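One possible direction for the first oddity, sketched here under assumptions and not taken from the PR, is to make the class generic over its element type so that `.data` and `.add()` share one type parameter. The `IndexSet` class below is a hypothetical stand-in, not the ixmp4 class.

```python
from __future__ import annotations

from typing import Generic, TypeVar

# Constrained TypeVar: an index set holds exactly one of these element types.
T = TypeVar("T", float, int, str)


class IndexSet(Generic[T]):
    """Hypothetical stand-in for the core index set; not the ixmp4 class."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._data: list[T] = []

    @property
    def data(self) -> list[T]:
        # The return type follows the type parameter instead of a union.
        return list(self._data)

    def add(self, value: T) -> None:
        self._data.append(value)


cities: IndexSet[str] = IndexSet("cities")
cities.add("foo")  # accepted by a type checker: T is str here
years: IndexSet[int] = IndexSet("years")
years.add(2030)  # accepted by a type checker: T is int here
# cities.add(1) would be flagged by a type checker, although it still runs.
```

The catch is that the element type must be known statically when the object is annotated, which may not hold for index sets loaded from a database, so this is only a partial answer to the question raised above.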

codecov · 2024-11-29T13:51:14Z

Codecov Report

Attention: Patch coverage is 90.90909% with 10 lines in your changes missing coverage. Please review.

Project coverage is 87.1%. Comparing base (8513cac) to head (d69ff7c).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ixmp4/data/db/iamc/utils.py | 82.0% | 7 Missing ⚠️ |
| ixmp4/data/abstract/run.py | 50.0% | 1 Missing ⚠️ |
| ixmp4/data/db/iamc/timeseries/repository.py | 50.0% | 1 Missing ⚠️ |
| ixmp4/data/db/timeseries.py | 50.0% | 1 Missing ⚠️ |

Additional details and impacted files

@@                Coverage Diff                @@
##           enh/run-get-by-id    #141   +/-   ##
=================================================
  Coverage               87.0%   87.1%           
=================================================
  Files                    230     231    +1     
  Lines                   8170    8216   +46     
=================================================
+ Hits                    7113    7160   +47     
+ Misses                  1057    1056    -1

| Files with missing lines | Coverage Δ | |
|---|---|---|
| ixmp4/core/iamc/data.py | `100.0% <100.0%> (+8.4%)` | ⬆️ |
| ixmp4/core/optimization/indexset.py | `91.5% <100.0%> (ø)` | |
| ixmp4/core/run.py | `98.0% <100.0%> (ø)` | |
| ixmp4/data/api/base.py | `88.4% <100.0%> (ø)` | |
| ixmp4/data/api/run.py | `98.0% <100.0%> (+0.1%)` | ⬆️ |
| ixmp4/data/db/meta/repository.py | `96.2% <100.0%> (-0.1%)` | ⬇️ |
| ixmp4/data/db/model/repository.py | `100.0% <100.0%> (ø)` | |
| ixmp4/data/db/run/repository.py | `95.8% <100.0%> (+1.3%)` | ⬆️ |
| ixmp4/data/types.py | `100.0% <100.0%> (ø)` | |
| ixmp4/server/rest/run.py | `100.0% <100.0%> (ø)` | |

... and 4 more

meksor · 2025-01-07T11:16:18Z

I would not add an endpoint for this, especially as it does not actually add any functionality (by definition, a run that is cloned could already be created via the REST API), and I hope that we won't be cloning thousands of runs on a regular basis.
The tests are very hard to read again; can we do something to make them easier to parse?

meksor · 2025-01-07T11:41:33Z

Just to clarify: I'll also accept the PR with the current idea, I'm just still reading through it...

danielhuppmann · 2025-01-07T11:51:18Z

Two quick responses to @meksor:

> I would not add an endpoint for this, especially as it does not actually add any functionality (by definition a run that is cloned could already be created via the REST API)

The benefit of a dedicated endpoint would be that the data is copied within or close to the database, right? That avoids sending the data back and forth via the REST API.

Also, having a dedicated `clone()` method avoids forgetting some parts of the Run (iamc data, optimization items, other things to be added in the future).

> I hope that we won't be cloning thousands of runs on a regular basis.

Sorry, cloning runs happens all the time in MESSAGE modelling: every time a user creates a variation of an existing scenario, they start by cloning the base scenario and then make some modifications.
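To illustrate the clone-then-modify workflow and the point about not forgetting parts of a run, here is a minimal sketch. The `Run` dataclass, its fields, and the `keep_solution` parameter are assumptions made for this example and do not reflect the actual ixmp4 data model or signature.

```python
from __future__ import annotations

import copy
from dataclasses import dataclass, field


@dataclass
class Run:
    """Hypothetical, simplified run; the field names are illustrative only."""

    model: str
    scenario: str
    iamc: list[dict] = field(default_factory=list)
    optimization: dict[str, list] = field(default_factory=dict)
    solution: dict[str, float] = field(default_factory=dict)

    def clone(self, scenario: str | None = None, keep_solution: bool = True) -> Run:
        # One method copies every part of the run, so callers cannot forget one.
        return Run(
            model=self.model,
            scenario=scenario or self.scenario,
            iamc=copy.deepcopy(self.iamc),
            optimization=copy.deepcopy(self.optimization),
            solution=copy.deepcopy(self.solution) if keep_solution else {},
        )


base = Run(
    model="m",
    scenario="base",
    iamc=[{"variable": "x", "value": 1.0}],
    optimization={"years": [2030]},
    solution={"objective": 3.5},
)

# Start a variation from the base scenario, then modify only the copy.
variant = base.clone(scenario="variant", keep_solution=False)
variant.optimization["years"].append(2040)
```

The deep copies keep the base run untouched when the variant is modified. In the setup discussed here, the copying would happen in the DB layer, not in client code.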

meksor · 2025-01-07T12:56:49Z

OK, convinced!

meksor

Hi, as I said, I think the two tests added here could benefit from comments/whitespace/abstraction to make them easier to read.
A hard requirement is a test that checks whether `clone` respects the permission check!
(It looks like it does, since it uses `create`, but it's best to bake this into a test.)
You can add a test in `test_auth.py`.
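The reasoning that `clone` inherits the permission check because it goes through `create` can be sketched as follows. The `RunRepository` class, the `can_edit` flag, and the `Forbidden` exception defined here are simplified stand-ins invented for this example, not the project's auth system.

```python
from __future__ import annotations


class Forbidden(Exception):
    """Stand-in for the permission error raised by the auth layer."""


class RunRepository:
    """Hypothetical repository; only the guard-delegation idea is the point."""

    def __init__(self, can_edit: bool) -> None:
        self.can_edit = can_edit
        self.runs: list[dict] = []

    def create(self, model: str, scenario: str) -> dict:
        # The permission check lives in create().
        if not self.can_edit:
            raise Forbidden(f"no edit access to {model}/{scenario}")
        run = {"id": len(self.runs) + 1, "model": model, "scenario": scenario}
        self.runs.append(run)
        return run

    def clone(self, run: dict, scenario: str | None = None) -> dict:
        # clone() goes through create(), so it inherits the same check.
        return self.create(run["model"], scenario or run["scenario"])


def clone_is_forbidden(repo: RunRepository, run: dict) -> bool:
    """Return True if cloning raises Forbidden, mirroring a pytest.raises check."""
    try:
        repo.clone(run)
    except Forbidden:
        return True
    return False
```

A dedicated test still matters: if `clone` were later reimplemented without calling `create`, only the test would catch the lost check.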

glatterf42 · 2025-01-08T08:09:46Z

Thanks for your review :)
I'm not familiar with all the intricacies of the auth system, so please let me know if the test I added is already sufficient. Without the `if platform_info == self.gated:` clause, `run.clone()` fails on the public and private platforms due to missing permissions.

As for the readability: I tried adding some blank lines and comments, but I'm actually just adding one test here (not counting the auth test, which does not require the same setup), so I'm not sure how to abstract things away here short of a refactoring regarding how we call the methods of adding data to optimization items or listing them. Please let me know what you think.

meksor · 2025-01-08T13:12:51Z

tests/test_auth.py

+                assert_cloned_run(run, clone_with_solution, kept_solution=True)
+            else:
+                with pytest.raises(Forbidden):
+                    _ = run.clone()


You should be able to just add to the `test_filters` and `test_guards` tests above.

glatterf42 · 2025-01-08

Ah, that should be done now :)

glatterf42 added the enhancement New feature or request label Nov 29, 2024

glatterf42 self-assigned this Nov 29, 2024

glatterf42 requested review from meksor and danielhuppmann December 20, 2024 07:40

glatterf42 force-pushed the enh/run-get-by-id branch from 2e9cb4c to 8513cac Compare December 20, 2024 13:16

glatterf42 added 9 commits December 20, 2024 14:17

Remove superfluous casts

321e81a

Remove superfluous lines

3cb5f39

Make return value of indexset.data consistent

54eb700

Enable all abstract EnumerateKwargs for DB-timeseries

335388b

Make iamc.data helper functions available in DB layer

39e1d19

Introduce run.clone()

9e2dab4

Update openapi schema

d517108

Remove outdated comment

2b85a87

Name helper file appropriately

0975ce3

glatterf42 force-pushed the enh/run-clone branch from 73b59c6 to 0975ce3 Compare December 20, 2024 13:22

meksor requested changes Jan 7, 2025

meksor reviewed Jan 8, 2025

glatterf42 added 2 commits January 8, 2025 15:24

Add test that run.clone uses auth system

02694e6

Enhance readability of run.clone test

d69ff7c

glatterf42 force-pushed the enh/run-clone branch from 8c584c2 to d69ff7c Compare January 8, 2025 14:25
