Add extra check if both values are NA in bulk_upsert (#53)
* add extra check if both values are NA

* revert disabling dynamic versioning

* add some comments

* add tests with pyarrow

* run less tests with pyarrow

* run more tests without pyarrow

* run correct amount of tests, hopefully
meksor authored Feb 29, 2024
1 parent 19ba8b4 commit c85d987
Showing 3 changed files with 413 additions and 536 deletions.
16 changes: 12 additions & 4 deletions .github/workflows/pytest.yaml
@@ -8,10 +8,14 @@ jobs:
     strategy:
       matrix:
         python-version:
-          - "3.10" # Earliest version supported by ixmp4
-          - "3.11"
-          - "3.12" # Latest version supported by ixmp4
-
+          - "3.10" # Earliest version supported by ixmp4
+          - "3.11"
+          - "3.12" # Latest version supported by ixmp4
+        with-pyarrow:
+          - false
+        include:
+          - python-version: "3.12"
+            with-pyarrow: true
     runs-on: ubuntu-latest
     services:
       postgres:
@@ -65,6 +69,10 @@ jobs:
       #----------------------------------------------
       # install your root project, if required
       #----------------------------------------------
+      - name: Install PyArrow
+        if: ${{ matrix.with-pyarrow }}
+        run: pip install pyarrow
+
       - name: Install library
         run: poetry install --no-interaction
       #----------------------------------------------
9 changes: 8 additions & 1 deletion ixmp4/data/db/base.py
@@ -389,7 +389,14 @@ def bulk_upsert_chunk(self, df: pd.DataFrame) -> None:
         for col in self.model_class.updateable_columns:
             updated_col = col + self.merge_suffix
             if updated_col in df.columns:
-                cond.append(df[col] != df[updated_col])
+                # coerce to same type so the inequality
+                # operation works with pyarrow installed
+                df[updated_col] = df[updated_col].astype(df[col].dtype)
+                are_not_equal = df[col] != df[updated_col]
+                # extra check if both values are NA because NA == NA = NA
+                # in pandas with pyarrow
+                both_are_na = pd.isna(df[col]) & pd.isna(df[updated_col])
+                cond.append(~both_are_na | are_not_equal)
 
         df["differs"] = np.where(np.logical_or.reduce(cond), True, False)
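
The NA handling described in the comments above can be illustrated in isolation. The following is a minimal sketch, not the ixmp4 implementation: `differs_mask` is a hypothetical helper, and pandas' nullable `Int64` dtype is used as a stand-in for a pyarrow-backed column on the assumption that both propagate `pd.NA` through comparisons in the same way. It assumes the intent is to flag a row only when the two values are unequal and not both missing, so it combines the two masks with `&`. With `|`, as in the hunk above, any row in which at least one value is present is flagged whether or not the values are equal.

```python
import pandas as pd


def differs_mask(old: pd.Series, new: pd.Series) -> pd.Series:
    """Hypothetical helper: True where `new` holds a genuinely different value."""
    # Coerce to the same dtype so the comparison is well defined.
    new = new.astype(old.dtype)
    # With nullable dtypes, comparing against pd.NA yields pd.NA rather
    # than True/False; treat those undecided comparisons as "not equal".
    are_not_equal = (old != new).fillna(True).astype(bool)
    # A pair of missing values counts as unchanged.
    both_are_na = old.isna() & new.isna()
    return ~both_are_na & are_not_equal


old = pd.Series([1, 2, None, None], dtype="Int64")
new = pd.Series([1, 3, None, 4], dtype="Int64")
mask = differs_mask(old, new)
print(mask.tolist())  # [False, True, False, True]
```

The same function should behave identically for plain `float64` columns, where `NaN != NaN` evaluates to `True` and the `both_are_na` mask is what suppresses the false positive.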
