Remove BigQuery, Sybase, Impala and Redshift virtual schemas as they have been migrated (#450)

* #446: Remove Sybase and Bigquery dialects
chiaradiamarcelo authored Jan 15, 2021
1 parent c13f5a8 commit 8cda00a
Showing 70 changed files with 85 additions and 4,823 deletions.
3 changes: 0 additions & 3 deletions .gitignore
@@ -13,9 +13,6 @@ dependency-reduced-pom.xml
# Intellij recommends to share iml files, however, better don't share files which might be outdated
*.iml

# Integration tests
src/test/resources/integration/driver/hive/*.jar

# Others
.DS_Store
*.swp
10 changes: 4 additions & 6 deletions README.md
@@ -133,9 +133,7 @@ Running the Virtual Schema requires a Java Runtime version 9 or later.
| [Apache Maven](https://maven.apache.org/) | Build tool | Apache License 2.0 |
| [Apache Thrift][apache-trift] | Needed for Hive integration test | Apache License 2.0 |
| [Exasol JDBC Driver][exasol-jdbc-driver] | JDBC driver for Exasol database | MIT License |
| [Exasol Testcontainers][exasol-testcontainers] | Exasol extension for the Testcontainers framework | MIT License |
| [HBase server][hbase-server] | The Hadoop database | Apache License 2.0 |
| [Hive JDBC Driver][hive-jdbc-driver] | JDBC driver for Hive database | Apache License 2.0 |
| [Java Hamcrest](http://hamcrest.org/JavaHamcrest/) | Checking for conditions in code via matchers | BSD License |
| [JUnit](https://junit.org/junit5) | Unit testing framework | Eclipse Public License 1.0 |
| [Mockito](http://site.mockito.org/) | Mocking framework | MIT License |
@@ -170,18 +168,18 @@ Running the Virtual Schema requires a Java Runtime version 9 or later.

[athena-dialect-doc]: https://github.com/exasol/athena-virtual-schema/blob/main/doc/user_guide/athena_user_guide.md
[aurora-dialect-doc]: doc/dialects/aurora.md
[big-query-dialect-doc]: doc/dialects/bigquery.md
[big-query-dialect-doc]: https://github.com/exasol/bigquery-virtual-schema/blob/main/doc/user_guide/bigquery_user_guide.md
[db2-dialect-doc]: https://github.com/exasol/db2-virtual-schema/blob/main/doc/user_guide/db2_user_guide.md
[exasol-dialect-doc]: https://github.com/exasol/exasol-virtual-schema/blob/main/doc/dialects/exasol.md
[hive-dialect-doc]: https://github.com/exasol/hive-virtual-schema/blob/main/doc/user_guide/hive_user_guide.md
[impala-dialect-doc]: doc/dialects/impala.md
[impala-dialect-doc]: https://github.com/exasol/impala-virtual-schema/blob/main/doc/dialects/impala_user_guide.md
[mysql-dialect-doc]: https://github.com/exasol/mysql-virtual-schema/blob/main/doc/user_guide/mysql_user_guide.md
[oracle-dialect-doc]: https://github.com/exasol/oracle-virtual-schema/blob/main/doc/user_guide/oracle_user_guide.md
[postgresql-dialect-doc]: https://github.com/exasol/postgresql-virtual-schema/blob/main/doc/dialects/postgresql.md
[redshift-dialect-doc]: doc/dialects/redshift.md
[redshift-dialect-doc]: https://github.com/exasol/redshift-virtual-schema/blob/main/doc/dialects/redshift_user_guide.md
[sap-hana-dialect-doc]: https://github.com/exasol/hana-virtual-schema/blob/main/doc/user_guide/user_guide.md
[sql-server-dialect-doc]: https://github.com/exasol/sqlserver-virtual-schema/blob/main/doc/user_guide/sqlserver_user_guide.md
[sybase-dialect-doc]: doc/dialects/sybase.md
[sybase-dialect-doc]: https://github.com/exasol/sybase-virtual-schema/blob/main/doc/user_guide/sybase_user_guide.md
[teradata-dialect-doc]: https://github.com/exasol/teradata-virtual-schema/blob/main/doc/dialects/teradata.md

[vs-api]: https://github.com/exasol/virtual-schema-common-java/blob/master/doc/development/api/virtual_schema_api.md
46 changes: 32 additions & 14 deletions doc/changes/changes_6.0.0.md
@@ -1,29 +1,47 @@
# Exasol Virtual Schemas 6.0.0, released 2021-XX-XX

Code name: Migrated Oracle, DB2, SQL Server, Teradata, Hana, Hive and Athena dialect implementations to their own repositories.
Code name: Dialects migration.

## Summary

Please be aware that you can no longer create Oracle, DB2, or SQL Server Virtual Schemas using this JAR.
- Oracle dialect implementation has been migrated to https://github.com/exasol/oracle-virtual-schema.
- DB2 dialect implementation has been migrated to https://github.com/exasol/db2-virtual-schema.
- SQL Server dialect implementation has been migrated to https://github.com/exasol/sqlserver-virtual-schema.
- Athena dialect implementation has been migrated to https://github.com/exasol/athena-virtual-schema.
- Teradata dialect implementation has been migrated to https://github.com/exasol/teradata-virtual-schema.
- Hive dialect implementation has been migrated to https://github.com/exasol/hive-virtual-schema.
- Hana dialect implementation has been migrated to https://github.com/exasol/hana-virtual-schema.
The following dialect implementations have been migrated to their own repositories:

- Oracle, moved to https://github.com/exasol/oracle-virtual-schema.
- DB2, moved to https://github.com/exasol/db2-virtual-schema.
- SQL Server, moved to https://github.com/exasol/sqlserver-virtual-schema.
- Athena, moved to https://github.com/exasol/athena-virtual-schema.
- Teradata, moved to https://github.com/exasol/teradata-virtual-schema.
- Hive, moved to https://github.com/exasol/hive-virtual-schema.
- Hana, moved to https://github.com/exasol/hana-virtual-schema.
- Big Query, moved to https://github.com/exasol/bigquery-virtual-schema.
- Sybase, moved to https://github.com/exasol/sybase-virtual-schema.
- Redshift, moved to https://github.com/exasol/redshift-virtual-schema.
- Impala, moved to https://github.com/exasol/impala-virtual-schema.

Please be aware that you can no longer create Virtual Schemas for the dialects listed above using this JAR.

## Refactoring

* #438: Removed Oracle dialect implementation as it has been migrated to https://github.com/exasol/oracle-virtual-schema.
* #440: Removed DB2 dialect implementation as it has been migrated to https://github.com/exasol/db2-virtual-schema.
* #442: Removed SQL Server dialect implementation as it has been migrated to https://github.com/exasol/sqlserver-virtual-schema.
* #444: Removed Athena dialect implementation as it has been migrated to https://github.com/exasol/athena-virtual-schema.
* #428: Removed Teradata, Hive and Hana dialects.
* #438: Removed Oracle dialect.
* #440: Removed DB2 dialect.
* #442: Removed SQL Server dialect.
* #444: Removed Athena dialect.
* #446: Removed Big Query, Sybase, Impala and Redshift dialects.

## Dependency updates

* Removed `org.testcontainers:oracle-xe:1.15.0`
* Removed `com.oracle.ojdbc:ojdbc8:19.3.0.0`
* Removed `org.testcontainers:mssqlserver:1.15.0`
* Removed `com.microsoft.sqlserver:mssql-jdbc:8.4.1.jre11`
* Removed `com.exasol:db-fundamentals-java:0.1.1`
* Removed `nl.jqno.equalsverifier:equalsverifier:3.5`
* Removed `com.exasol:exasol-jdbc:7.0.3`
* Removed `com.exasol:exasol-testcontainers:3.3.1`
* Removed `org.testcontainers:junit-jupiter:1.15.0`
* Removed `org.apache.hive:hive-jdbc:3.1.2`
* Removed `org.apache.thrift:libthrift:0.13.0`
* Removed `org.apache.hbase:hbase-server:2.3.3`
* Removed `com.exasol:test-db-builder-java:1.1.0`
* Removed `com.exasol:hamcrest-resultset-matcher:1.2.1`
@@ -43,7 +43,7 @@ See [PostgreSQLDialectIT](https://github.com/exasol/postgresql-virtual-schema/bl

In order not to create security issues, make sure the data in the source database is not confidential (demo data only).

## Executing Enabled Integration Tests
## Executing Integration Tests

We use the following [Maven life cycle phases](https://maven.apache.org/guides/introduction/introduction-to-the-lifecycle.html) for our integration tests:

@@ -64,30 +64,6 @@ Another way to run integration tests:

* Create a package of Virtual Schemas using the `mvn package` command and run the integration tests inside your IDE in the same way as unit tests.

List of enabled integration tests:

* ExasolSqlDialectIT (in [exasol-virtual-schema](https://github.com/exasol/exasol-virtual-schema) repository)
* PostgreSQLSqlDialectIT (in [postgresql-virtual-schema](https://github.com/exasol/postgresql-virtual-schema) repository)

## Executing Disabled Integration Tests

Some integration tests do not run automatically, but it is possible to execute them locally.
The reason these tests are disabled is that we can only deliver drivers whose license allows redistribution.
Therefore we cannot include some JDBC drivers in the project, and you need to download them manually for local integration testing.

List of disabled integration tests:

* HiveSqlDialectIT

### Starting Disabled Integration Test Locally

1. Download the [Hive JDBC driver `HiveJDBC41.jar`](https://www.cloudera.com/downloads/connectors/hive/jdbc/2-5-4.html)
2. Temporarily put the driver into the `src/test/resources/integration/driver/hive` directory.
3. If the file names differ from those mentioned above (you renamed the file, or it has a different version number, for example), edit the `src/test/resources/integration/driver/hive/hive.properties` and `settings.cfg` files.
4. Run the tests from an IDE, or temporarily add the integration test name to the `maven-failsafe-plugin`'s `includes` section and execute the `mvn verify` command.
5. Remove the driver after the test and **do not upload it to the GitHub repository**.

## See also

* [Developing an SQL dialect](developing_a_dialect.md)
111 changes: 1 addition & 110 deletions doc/dialects/bigquery.md
@@ -1,112 +1,3 @@
# Big Query SQL Dialect

The Big Query SQL dialect allows you to connect to [Google Big Query](https://cloud.google.com/bigquery/), Google's serverless enterprise data warehouse.

## JDBC Driver

Download the [Simba JDBC Driver for Google BigQuery](https://cloud.google.com/bigquery/providers/simba-drivers/).

## Uploading the JDBC Driver to EXAOperation

1. [Create a bucket in BucketFS](https://docs.exasol.com/administration/on-premise/bucketfs/create_new_bucket_in_bucketfs_service.htm)
1. [Upload the driver to BucketFS](https://docs.exasol.com/administration/on-premise/bucketfs/accessfiles.htm)

**Hint**: The Magnitude Simba driver contains a lot of JAR files, but you can upload all of them together as an archive (`.tar.gz`, for example).
The archive will be unpacked automatically in the bucket, and you can access the files using the following path pattern: `<your bucket>/<archive's name without extension>/<name of a file from the archive>.jar`.

Leave only `.jar` files in the archive. This will help you generate the list for the adapter script later.

## Installing the Adapter Script

Upload the latest available release of [Virtual Schema JDBC Adapter](https://github.com/exasol/virtual-schemas/releases) to BucketFS.

Then create a schema to hold the adapter script.

```sql
CREATE SCHEMA SCHEMA_FOR_VS_SCRIPT;
```

The SQL statement below creates the adapter script, defines the Java class that serves as the entry point, and tells the UDF framework where to find the libraries (JAR files) for the Virtual Schema and the database driver.

List all the JAR files from the Magnitude Simba JDBC driver.

```sql
CREATE JAVA ADAPTER SCRIPT SCHEMA_FOR_VS_SCRIPT.ADAPTER_SCRIPT_BIGQUERY AS
%scriptclass com.exasol.adapter.RequestDispatcher;
%jar /buckets/<BFS service>/<bucket>/virtual-schema-dist-8.0.0-bundle-6.0.0.jar;
%jar /buckets/<BFS service>/<bucket>/GoogleBigQueryJDBC42.jar;
...
...
...
/
;
```

**Hint**: to avoid filling in the list by hand, use the convenience UDF script [bucketfs_ls](https://github.com/exasol/exa-toolbox/blob/master/utilities/bucketfs_ls.sql).
Create a script and run it as in the following example:

```sql
SELECT '%jar /buckets/<BFS service>/<bucket>/<archive's name without extension if used>/'|| files || ';' FROM (SELECT EXA_toolbox.bucketfs_ls('/buckets/<BFS service>/<bucket>/<archive's name without extension if used>/') files );
```

## Defining a Named Connection

Please follow the [Authenticating to a Cloud API Service article](https://cloud.google.com/docs/authentication/) to get Google service account credentials.

Upload the key to BucketFS, then create a named connection:

```sql
CREATE OR REPLACE CONNECTION BIGQUERY_JDBC_CONNECTION
TO 'jdbc:bigquery://https://www.googleapis.com/bigquery/v2:443;ProjectId=<your project id>;OAuthType=0;OAuthServiceAcctEmail=<service account email>;OAuthPvtKeyPath=/<path to the bucket>/<name of the key file>';
```
You can find additional information about the JDBC connection string in the [Big Query JDBC installation guide](https://www.simba.com/products/BigQuery/doc/JDBC_InstallGuide/content/jdbc/using/intro.htm).

## Creating a Virtual Schema

Below you see how a Big Query Virtual Schema is created. Please note that you have to provide the name of a catalog and the name of a schema.

```sql
CREATE VIRTUAL SCHEMA <virtual schema name>
USING SCHEMA_FOR_VS_SCRIPT.ADAPTER_SCRIPT_BIGQUERY
WITH
SQL_DIALECT = 'BIGQUERY'
CONNECTION_NAME = 'BIGQUERY_JDBC_CONNECTION'
CATALOG_NAME = '<catalog name>'
SCHEMA_NAME = '<schema name>';
```
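
After the Virtual Schema is created, its tables can be queried like regular Exasol tables. The statement below is a minimal sketch; the schema name `BIGQUERY_VS` and the table name `SALES` are hypothetical placeholders for your own names.

```sql
-- Hypothetical names: replace BIGQUERY_VS and SALES with your own
-- virtual schema and source table names.
SELECT * FROM BIGQUERY_VS."SALES" LIMIT 10;
```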


## Data Types Conversion

BigQuery Data Type | Supported | Converted Exasol Data Type| Known limitations
-------------------|-----------|---------------------------|-------------------
BOOLEAN | ✓ | BOOLEAN |
BYTES | × | |
DATE | ✓ | DATE |
DATETIME | ✓ | TIMESTAMP |
FLOAT | ✓ | DOUBLE | Expected range for correct mapping: -99999999.99999999 .. 99999999.99999999.
GEOGRAPHY | ✓ | VARCHAR(65535) |
INTEGER | ✓ | DECIMAL(19,0) |
NUMERIC | ✓ | VARCHAR(2000000) |
RECORD/STRUCT | × | |
STRING | ✓ | VARCHAR(65535) |
TIME | ✓ | VARCHAR(16) |
TIMESTAMP | ✓ | TIMESTAMP | Expected range for correct mapping: 1582-10-15 00:00:01 .. 9999-12-31 23:59:59.9999. JDBC driver maps dates before 1582-10-15 00:00:01 incorrectly. Example of incorrect mapping: 1582-10-14 22:00:01 -> 1582-10-04 22:00:01
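
Because BigQuery `NUMERIC` values arrive as `VARCHAR(2000000)`, the sketch below shows one way to restore numeric semantics in a query by casting explicitly. The schema name `BIGQUERY_VS`, the table `SALES`, and the column `PRICE` are hypothetical placeholders.

```sql
-- NUMERIC is mapped to VARCHAR, so cast it back to a decimal type when
-- you need to aggregate or compare numerically (hypothetical names).
SELECT CAST("PRICE" AS DECIMAL(36,9)) AS PRICE_NUMERIC
FROM BIGQUERY_VS."SALES";
```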

If you need to use currently unsupported data types or find a way around known limitations, please create a GitHub issue in the [VS repository](https://github.com/exasol/virtual-schemas/issues).

## Performance

Please be aware that the current implementation of the dialect can only handle result sets of limited size (a few thousand rows).
If you need to process a large amount of data, please contact our support team. Another implementation of the dialect with a performance improvement (using `IMPORT INTO`) is available, but it is not documented for self-service because of:

1. the complex installation process
2. security risks (a user has to disable the drivers' security manager to use it)

## Testing information

In the following matrix you find combinations of JDBC driver and dialect version that we tested.

Virtual Schema Version| Big Query Version | Driver Name | Driver Version
----------------------|---------------------|---------------------------------------------|------------------------
3.0.2 | Google BigQuery 2.0 | Magnitude Simba JDBC driver for BigQuery | 1.2.2.1004
The Big Query Virtual Schema has been migrated to its [own repository](https://github.com/exasol/bigquery-virtual-schema/).