The Iceberg Table Analysis CLI Tool evaluates Iceberg tables to identify how Upsolver optimizations can enhance efficiency.
It presents a side-by-side comparison of current metrics against potential improvements in scan durations, file counts, and file sizes, providing a straightforward assessment of optimization opportunities.
iceberg-diag
can be installed using either Brew or PIP, as detailed below:
- Python 3.8 or higher: Verify Python's installation:
python3 --version
To install iceberg-diag
using PIP, ensure you have the latest version of pip
:
pip install --upgrade pip
Then, install the package with pip
pip install iceberg-diag
Execute the following commands to install iceberg-diag
via Brew:
brew tap upsolver/iceberg-diag
brew install iceberg-diag
iceberg-diag [options]
-h
,--help
: Display the help message and exit.--profile PROFILE
: Set the AWS credentials profile for the session, defaults to the environment's current settings.--region REGION
: Set the AWS region for operations, defaults to the specified profile's default region.--database DATABASE
: Set the database name, will list all available iceberg tables if no--table-name
provided.--table-name TABLE_NAME
: Enter the table name or a glob pattern (e.g.,'*'
,'tbl_*'
).--remote
: Enable remote diagnostics by sending data to the Upsolver API for processing.
Provides more detailed analytics and includes information about file size reducations.-v, --verbose
: Enable verbose logging
-
Displaying help information:
iceberg-diag --help
-
Listing all available databases in profile:
iceberg-diag --profile <profile>
-
Listing all available iceberg tables in a given database:
iceberg-diag --profile <profile> --database <database>
-
Running diagnostics on a specific table in a specific AWS profile and region (completely locally):
iceberg-diag --profile <profile> --region <region> --database <database> --table-name '*'
-
Running diagnostics using
remote
optioniceberg-diag --profile <profile> --database <database> --table-name 'prod_*' --remote