All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog and this project adheres to Semantic Versioning.
- NullPointerException in tearDown of `StandaloneHiveRunner`.
- Upgraded Hive version to 2.3.7 (was 2.3.6) (allows HiveRunner to be used on JDK>=9).
- Upgraded Hive version to 2.3.6 (was 2.3.4).
- Upgraded JUnit Jupiter version to 5.6.0 (was 5.5.1).
- Depend on `junit-jupiter` instead of `junit-jupiter-api`.
- Default supported Hive version is now 2.3.4 (was 2.3.3) as version 2.3.3 has a vulnerability.
- `TemporaryFolder` (JUnit 4) has been changed to `Path` (Java NIO) throughout the project for the JUnit5 update.
  - NOTE: The `HiveServerContext` class now uses `Path` instead of `TemporaryFolder` in the constructor.
- Internal refactoring to support upcoming "Mutant Swarm" project which provides unit test coverage for Hive SQL scripts. See #65.
- Support shell-specific `source` (`hive`) and `!run` (`beeline`) commands. These commands allow one to import and execute the contents of external files in statements or scripts.
- Default supported Hive version is now 2.3.3 (was 1.2.1).
- Default supported Tez version is now 0.9.1 (was 0.7.0).
- Supported Java version is 8 (was 7).
- In-memory DB used by HiveRunner is now Derby (was HSQLDB).
- Log4J configuration file removed from jar artifact.
- System property to configure command shell emulation mode renamed to `commandShellEmulator` (was `commandShellEmulation`).
- Fixed an issue where a column name in a file was treated as a different column if its case differed from the case used in the table definition #73.
- The way of setting writable permissions on JUnit temporary folder changed to make it compatible with Windows #63.
- Added functionality for headers in TSV parser. This way you can dynamically add TSV files declaring a subset of columns using insertInto.
- Added debug logging of result set. Enable by setting `log4j.logger.com.klarna.hiverunner.HiveServerContainer=DEBUG` in log4j.properties.
- Added methods to the shell that allow statements contained in files to be executed and their results gathered. These are particularly useful for HQL scripts that generate no table based data and instead write results to STDOUT. In practice we've seen these scripts used in data processing job orchestration scripts (e.g. `bash`) to check for new data, calculate processing boundaries, etc. These values are then used to appropriately configure and launch some downstream job.
- Support abstract base class #48.
- Upgraded to Hive 1.2.1 (Note: new major release with backwards incompatibility issues). As of Hive 1.2 there are a number of new reserved keywords, see DDL manual for more information. If you happen to have one of these as an identifier, you could either backtick quote them (e.g. `date`, `timestamp` or `update`) or set hive.support.sql11.reserved.keywords=false.
- Users of Hive version 0.14 or older are recommended to use HiveRunner version 2.6.0.
- Removed the custom HiveConf hive.vs. Use hadoop.tmp.dir instead.
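
As a rough illustration of the backtick quoting mentioned for Hive 1.2 reserved keywords, the sketch below wraps an identifier in backticks when it appears in a small keyword set. The class `ReservedKeywordQuoter` and its method are hypothetical and not part of HiveRunner or Hive. The keyword set contains only the three examples named above; the full list is in the Hive DDL manual.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration only; not part of HiveRunner or Hive.
final class ReservedKeywordQuoter {
    // Deliberately incomplete: just the three keywords used as examples in this changelog.
    static final Set<String> RESERVED =
            new HashSet<>(Arrays.asList("date", "timestamp", "update"));

    // Wraps the identifier in backticks if it matches a reserved keyword (case-insensitive).
    static String quoteIfReserved(String identifier) {
        boolean reserved = RESERVED.contains(identifier.toLowerCase(Locale.ROOT));
        return reserved ? "`" + identifier + "`" : identifier;
    }

    public static void main(String[] args) {
        String column = quoteIfReserved("date");
        System.out.println("SELECT " + column + " FROM events");
    }
}
```

Setting `hive.support.sql11.reserved.keywords=false` avoids the need for quoting altogether, at the cost of diverging from the default Hive behaviour.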
- Introduced command shell emulations to replicate different handling of full line comments in `hive` and `beeline` shells. Now strips full line comments for executed scripts to match the behaviour of the `hive -f` file option.
- Option to use files as input for com.klarna.hiverunner.HiveShell.execute(...).
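
The full line comment stripping described above can be pictured with a minimal sketch. It assumes that a full line comment is a line whose first non-whitespace characters are `--`. The class `FullLineCommentStripper` is hypothetical and is not HiveRunner's actual implementation; trailing comments after a statement are deliberately left alone.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; not HiveRunner's implementation.
final class FullLineCommentStripper {
    // Drops every line whose first non-whitespace characters are "--".
    // Lines that merely end with a comment are kept unchanged.
    static String strip(String script) {
        return Arrays.stream(script.split("\n", -1))
                .filter(line -> !line.trim().startsWith("--"))
                .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        String script = "-- select everything\n"
                + "SELECT *\n"
                + "FROM foo; -- trailing comment is kept\n";
        System.out.print(strip(script));
    }
}
```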
- Fixed deadlock in `ThrowOnTimeout.java` that occurred when running a long-running test case with the timeout disabled.
- Added support with `HiveShell.insertInto` for fluently generating test data in a table storage format agnostic manner.
- Enabled any hiveconf variables to be set as System properties by using the naming convention hiveconf_[HiveConf property name], e.g. hiveconf_hive.execution.engine.
- Fixed bug: Result sets bigger than 100 rows only returned the first 100 rows.
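
To make the `hiveconf_` naming convention above concrete, here is a small sketch that collects every property whose key starts with `hiveconf_` and strips the prefix to recover the HiveConf property name. The class `HiveConfSystemProperties` and its `collect` method are hypothetical names used for illustration; how HiveRunner itself reads these properties is not shown in this changelog.

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not part of HiveRunner.
final class HiveConfSystemProperties {
    static final String PREFIX = "hiveconf_";

    // Returns the properties whose key starts with "hiveconf_",
    // keyed by the HiveConf property name that follows the prefix.
    static Map<String, String> collect(Properties properties) {
        Map<String, String> hiveConf = new TreeMap<>();
        for (String key : properties.stringPropertyNames()) {
            if (key.startsWith(PREFIX) && key.length() > PREFIX.length()) {
                hiveConf.put(key.substring(PREFIX.length()), properties.getProperty(key));
            }
        }
        return hiveConf;
    }

    public static void main(String[] args) {
        // Same effect as passing -Dhiveconf_hive.execution.engine=tez to the JVM.
        System.setProperty("hiveconf_hive.execution.engine", "tez");
        System.out.println(collect(System.getProperties()));
    }
}
```

In a build, such a property would typically be passed on the JVM command line of the test runner rather than set in code.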
- Merged Tez and MR context into the same context again. Now, the same test suite may alternate between execution engines by doing e.g.:

        hive> set hive.execution.engine=tez;
        hive> [some query]
        hive> set hive.execution.engine=mr;
        hive> [some query]

- Added support for setting hivevars via HiveShell.