Remove `technology_type` column from `tools.costs` #269

measrainsey · 2024-12-12T11:03:19Z

Small PR to remove the technology_type column from the costs tool

This PR should close:

Cost reduction CSV has unnecessary technology_type column #260

This column was carried over from the legacy costs calculation work. While I think it may be helpful for understanding how technologies are grouped, the column doesn't serve any purpose in the code base: the only places the code mentions it are where it is dropped.

Additionally, when others have been adding technologies to the different modules, there has been confusion about what they should put in this column (and it also becomes added work/input).

So I've decided to remove the usage of this column from the costs tool. Shouldn't affect the tool really in any way.

But might cause some merge errors in #235.

How to review

For @khaeru and/or @glatterf42: Read the diff and note that the CI checks all pass.

PR checklist

Continuous integration checks all ✅
~~Add or expand tests; coverage checks both~~ ✅
~~Add, expand, or update documentation.~~
Update doc/whatsnew.

codecov · 2024-12-12T11:15:05Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 76.6%. Comparing base (44087e4) to head (59c4ace).

Additional details and impacted files

@@           Coverage Diff           @@
##            main    #269     +/-   ##
=======================================
- Coverage   77.6%   76.6%   -1.0%     
=======================================
  Files        211     211             
  Lines      16079   16079             
=======================================
- Hits       12481   12332    -149     
- Misses      3598    3747    +149

| Files with missing lines | Coverage Δ |
|---|---|
| message_ix_models/tools/costs/decay.py | `100.0% <100.0%> (ø)` |

... and 8 files with indirect coverage changes

glatterf42

Thanks, this looks good to me. One suggestion in-line, but no need to pick it up if you don't want to.

glatterf42 · 2024-12-12T12:26:14Z

message_ix_models/tools/costs/decay.py

-                .reset_index(drop=True)
-                .drop(columns=["technology_type"])
+            reduction_joined = reduction_energy._append(reduction_module).reset_index(
+                drop=True
            )
        else:
            reduction_joined = reduction_energy.copy()


In such cases, I prefer this kind of syntax:

reduction_joined = reduction_energy.copy() if module == "energy" else reduction_energy._append(reduction_module).reset_index(drop=True)

I think this is an improvement as it reduces the number of lines requiring tests and more clearly states that all you do is define reduction_joined, just slightly differently.

khaeru · 2024-12-12

Sorry, but what is ._append() here? I thought pandas deprecated and removed methods like DataFrame.append() in v2.0 (2023), and I don't see this one in the documentation.

measrainsey · 2024-12-12

ah interesting - it seems that function may be a "private" function that isn't supported by pandas 🙈 i'll push a commit to switch to using concat or something else like that!
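For readers following the thread, here is a minimal sketch of the swap being discussed, using only the public `pd.concat()` function. The two data frames, their column names (`technology`, `cost_reduction`) and their values are hypothetical stand-ins, not the contents of the real cost reduction files; only the variable names come from the diff above.

```python
import pandas as pd

# Hypothetical stand-ins for the two frames named in the diff above.
reduction_energy = pd.DataFrame(
    {"technology": ["tech_a", "tech_b"], "cost_reduction": [0.1, 0.2]}
)
reduction_module = pd.DataFrame(
    {"technology": ["tech_c"], "cost_reduction": [0.3]}
)

# Public-API counterpart of
#   reduction_energy._append(reduction_module).reset_index(drop=True)
# ignore_index=True renumbers the rows from 0, so no separate
# reset_index(drop=True) call is needed.
reduction_joined = pd.concat(
    [reduction_energy, reduction_module], ignore_index=True
)
print(reduction_joined)
```

Passing `ignore_index=True` is one way to get a fresh index; chaining `.reset_index(drop=True)` after `pd.concat()` would give the same result.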

measrainsey · 2024-12-12

thanks @glatterf42 for the suggestion!

the if...else statement is actually not on module == "energy" but on if os.path.exists(ffile). if the file exists, then the code reads in the file as a dataframe, deletes some rows in another dataframe, and then concats the two dataframes together as the output. if the file doesn't exist, then the output is just the second dataframe copied over. i thought about condensing the code like you suggested but seems like the if part requires a few more steps? 😅
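The branch described in this comment can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in decay.py: the helper name `join_reduction`, the column names, and in particular the rule for which rows get deleted (rows whose technology also appears in the module file) are all hypothetical, since the thread does not spell them out.

```python
import os
import tempfile

import pandas as pd


def join_reduction(ffile: str, reduction_energy: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper mirroring the if/else described in the comment."""
    if os.path.exists(ffile):
        # Read the module-specific file as a data frame.
        reduction_module = pd.read_csv(ffile)
        # ASSUMPTION: the rows deleted from the other frame are those whose
        # technology also appears in the module file; the thread does not say.
        keep = ~reduction_energy["technology"].isin(reduction_module["technology"])
        # Concatenate the filtered energy rows with the module rows.
        return pd.concat(
            [reduction_energy[keep], reduction_module], ignore_index=True
        )
    # No module file: the output is just a copy of the energy frame.
    return reduction_energy.copy()


energy = pd.DataFrame(
    {"technology": ["tech_a", "tech_b"], "cost_reduction": [0.1, 0.2]}
)

with tempfile.TemporaryDirectory() as tmp:
    ffile = os.path.join(tmp, "module.csv")
    missing = join_reduction(ffile, energy)  # file absent: plain copy

    module = pd.DataFrame(
        {"technology": ["tech_b", "tech_c"], "cost_reduction": [0.5, 0.3]}
    )
    module.to_csv(ffile, index=False)
    joined = join_reduction(ffile, energy)  # file present: filter, then concat

print(missing)
print(joined)
```

Because the file-exists branch needs several statements (read, filter, concatenate), it does not collapse into a single conditional expression as neatly as the one-line suggestion earlier in the thread.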

glatterf42 · 2024-12-13

Oh, sorry, I misunderstood what was going on. I guess it's technically possible to still make this one line, but I don't know if it will be easier to read. Up to you if you want to try this :)

measrainsey · 2024-12-16

if it's alright, i would prefer to skip editing this part for this PR and move forward with approving/merging -- thank you! :)

measrainsey added 2 commits December 12, 2024 11:56

Remove technology_type column from input CSV files

cf509f6

Remove instances in code that used technology_type column

d61e2ec

measrainsey added the costs `.tools.costs`/cost data preparation label Dec 12, 2024

measrainsey self-assigned this Dec 12, 2024

measrainsey changed the title ~~Costs/tech type~~ Remove technology_type column from tools.costs Dec 12, 2024

measrainsey marked this pull request as ready for review December 12, 2024 11:04

Add #269 to doc/whatsnew

1b61d33

glatterf42 approved these changes Dec 12, 2024

glatterf42 linked an issue Dec 12, 2024 that may be closed by this pull request

Cost reduction CSV has unnecessary technology_type column #260

Closed

Replace ._append() with pd.concat()

59c4ace

khaeru merged commit d55fbf9 into main Dec 16, 2024
30 checks passed

khaeru deleted the costs/tech-type branch December 16, 2024 12:07
