Iceberg: Adding a new column doesn't work
In Iceberg, adding a column is a metadata-only operation.
Data inserted after the change is written with the new schema,
but the Parquet files written before the schema change must be
projected to the latest schema when they are read.
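
A minimal sketch of what "projecting to the latest schema" means for the table this commit provisions. It is not Daft's or PyIceberg's implementation: `project_to_schema` is a hypothetical helper, rows are modelled as plain dicts, and columns are matched by name for brevity, whereas Iceberg itself resolves columns by field ID.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def project_to_schema(rows: List[Row], latest_columns: List[str]) -> List[Row]:
    """Hypothetical helper: reshape rows to the latest schema.

    Columns that a row's data file predates are filled with None (null).
    Simplification: columns are matched by name, not by Iceberg field ID.
    """
    return [{col: row.get(col) for col in latest_columns} for row in rows]


# Rows from the data file written before ALTER TABLE ... ADD COLUMN name STRING.
old_file: List[Row] = [{"idx": 1}, {"idx": 2}, {"idx": 3}]

# Rows from the data file written by the INSERT after the schema change.
new_file: List[Row] = [{"idx": 3, "name": "abc"}, {"idx": 4, "name": "def"}]

latest_columns = ["idx", "name"]

# A reader has to project every file to the latest schema before combining them.
table = project_to_schema(old_file, latest_columns) + project_to_schema(
    new_file, latest_columns
)
```

Under these assumptions, the three older rows come back with `name` set to null instead of causing the read to fail on a missing column.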
Fokko committed Jan 18, 2024
1 parent 88effa2 commit c6aa9c5
Showing 3 changed files with 19 additions and 2 deletions.
4 changes: 2 additions & 2 deletions tests/integration/iceberg/docker-compose/Dockerfile
@@ -38,9 +38,9 @@ WORKDIR ${SPARK_HOME}

ENV SPARK_VERSION=3.4.2
ENV ICEBERG_SPARK_RUNTIME_VERSION=3.4_2.12
-ENV ICEBERG_VERSION=1.4.0
+ENV ICEBERG_VERSION=1.4.3
ENV AWS_SDK_VERSION=2.20.18
-ENV PYICEBERG_VERSION=0.4.0
+ENV PYICEBERG_VERSION=0.5.1

RUN curl --retry 3 -s -C - https://daft-public-data.s3.us-west-2.amazonaws.com/distribution/spark-${SPARK_VERSION}-bin-hadoop3.tgz -o spark-${SPARK_VERSION}-bin-hadoop3.tgz \
&& tar xzf spark-${SPARK_VERSION}-bin-hadoop3.tgz --directory /opt/spark --strip-components 1 \
16 changes: 16 additions & 0 deletions tests/integration/iceberg/docker-compose/provision.py
@@ -322,3 +322,19 @@
('123')
"""
)
+
+spark.sql(
+"""
+CREATE OR REPLACE TABLE default.add_new_column
+USING iceberg
+AS SELECT
+1 AS idx
+UNION ALL SELECT
+2 AS idx
+UNION ALL SELECT
+3 AS idx
+"""
+)
+
+spark.sql("ALTER TABLE default.add_new_column ADD COLUMN name STRING")
+spark.sql("INSERT INTO default.add_new_column VALUES (3, 'abc'), (4, 'def')")
1 change: 1 addition & 0 deletions tests/integration/iceberg/test_table_load.py
@@ -37,6 +37,7 @@ def test_daft_iceberg_table_open(local_iceberg_tables):
# "test_table_sanitized_character", # Bug in scan().to_arrow().to_arrow()
"test_table_version", # we have bugs when loading no files
"test_uuid_and_fixed_unpartitioned",
+"add_new_column"
]

