- Fix deprecation warning by setting SSLContext protocol.
- Add hotfix for lineterminator change in Pandas 1.5.*
- HTTP transport callbacks are now executed inside a context manager for read or write pipe. It guarantees that the pipe will be closed in the main thread regardless of successful execution or an exception in the callback function. It should help to prevent certain edge cases with pipes on Windows, when pipe .close() can block if called in unexpected order.
- HTTP transport "server" termination was simplified. Now it always closes "write" end of pipe first, followed by "read" end of pipe.
- Attempt to fix GitHub action SSL errors.
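
The pipe handling described in the HTTP transport entry above can be illustrated with a minimal sketch. It is not the library's code: `run_callback_with_pipe` is a hypothetical helper, and the only point shown is that wrapping the pipe in a `with` block closes it whether the callback returns or raises.

```python
import os


def run_callback_with_pipe(callback, fd, mode="rb"):
    """Run callback(pipe) and close the pipe afterwards, even on failure.

    Hypothetical helper for illustration only. File objects returned by
    os.fdopen() are context managers, so leaving the "with" block closes
    the pipe both on a normal return and on an exception.
    """
    with os.fdopen(fd, mode) as pipe:
        return callback(pipe)


if __name__ == "__main__":
    read_fd, write_fd = os.pipe()
    os.write(write_fd, b"1,foo\n2,bar\n")
    os.close(write_fd)  # signal end of data to the reader

    print(run_callback_with_pipe(lambda pipe: pipe.read(), read_fd))
```
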
- Encryption is now enabled by default both for WebSocket and HTTP transport. Non-encrypted connections will be disabled in Exasol 8.0.
It may introduce some extra CPU overhead. If it becomes a problem, you may still disable encryption explicitly by setting encryption=False.
- SSL certificate verification is now enabled when used with access_token or refresh_token connection options.
- Updated documentation regarding encryption.
OpenID tokens are used to connect to Exasol SAAS clusters, which are reachable via public internet addresses. Unlike Exasol clusters running "on-premises" and secured by corporate VPNs and private networks, SAAS clusters are at risk of MITM attacks. SSL certificates must be verified in such conditions.
Exasol SAAS certificates are properly configured using standard certificate authority, so no extra configuration is required.
- Added initial implementation of certificate fingerprint validation, similar to standard JDBC / ODBC clients starting from version 7.1+.
- Replaced most occurrences of ambiguous word host in code with hostname or ipaddr, depending on context.
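
For readers unfamiliar with certificate fingerprint validation, here is a generic sketch of the idea. It assumes the fingerprint is a SHA-256 digest of the DER-encoded server certificate written as hex; that assumption, and the function names, are illustrative and not taken from PyEXASOL itself.

```python
import hashlib
import hmac


def certificate_fingerprint(der_cert: bytes) -> str:
    """Return the upper-case hex SHA-256 digest of a DER-encoded certificate."""
    return hashlib.sha256(der_cert).hexdigest().upper()


def verify_fingerprint(der_cert: bytes, expected: str) -> None:
    """Raise ValueError if the certificate does not match the expected fingerprint.

    Hypothetical helper: colons and letter case in the expected value are
    ignored, and the comparison runs in constant time.
    """
    normalized = expected.replace(":", "").strip().upper()
    actual = certificate_fingerprint(der_cert)

    if not hmac.compare_digest(actual, normalized):
        raise ValueError(f"Certificate fingerprint mismatch, got {actual}")
```

In a real client the certificate bytes would typically come from `ssl.SSLSocket.getpeercert(binary_form=True)` after the TLS handshake.
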
- Improved termination logic for HTTP transport thread while handling an exception. Order of closing pipes now depends on type of callback (EXPORT or IMPORT). It should help to prevent unresponsive infinite loop on Windows.
- Improved parallel HTTP transport examples with better exception handling.
- Removed comment about if __name__ == '__main__': being required for Windows OS only. Multiprocessing on macOS uses spawn method in the most recent Python versions, so it is no longer unique.
- pyopenssl is now a hard dependency, which is required for encrypted HTTP transport to generate an "ad-hoc" certificate. Encryption will be enabled by default for SAAS Exasol in the future.
- Added orjson as possible option for json_lib connection parameter.
- Default indent for JSON debug output is now 2 (was 4) for more compact representation.
- ensure_ascii is now False (was True) for better readability and smaller payload size.
- Fixed JSON examples, fetch_mapper is now set correctly for second data set.
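
The effect of the indent and ensure_ascii changes can be seen with the standard `json` module. The payload below is made up for the illustration and is not an actual protocol message.

```python
import json

# Made-up payload containing a non-ASCII character
payload = {"command": "execute", "sqlText": "SELECT 'Zürich'"}

# Previous defaults: wider indent, non-ASCII characters escaped as \uXXXX
old_style = json.dumps(payload, indent=4, ensure_ascii=True)

# New defaults: narrower indent, non-ASCII characters kept as-is
new_style = json.dumps(payload, indent=2, ensure_ascii=False)

if __name__ == "__main__":
    print(old_style)
    print(new_style)
```

Both strings decode to the same object; only the textual representation differs.
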
BREAKING (!): HTTP transport was significantly reworked in this version. Now it uses threading instead of subprocess to handle CSV data streaming.
There are no changes in a common single-process HTTP transport.
There are some breaking changes in parallel HTTP transport:
- Argument mode was removed from http_transport() function, it is no longer needed.
- Word "proxy" used in context of HTTP transport was replaced with "exa_address" in documentation and code. Word "proxy" now refers to connections routed through an actual HTTP proxy only.
- Function ExaHTTPTransportWrapper.get_proxy() was replaced with property ExaHTTPTransportWrapper.exa_address. Function .get_proxy() is still available for backwards compatibility, but it is deprecated.
- Module pyexasol_utils.http_transport no longer exists.
- Constants HTTP_EXPORT and HTTP_IMPORT are no longer exposed in pyexasol module.
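
The replacement of a function with a property, while keeping the old function as a deprecated alias, usually follows a pattern like the one below. This is a sketch only: `TransportWrapperSketch` is a hypothetical stand-in for the real wrapper class, and the "ipaddr:port" address format is an assumption.

```python
import warnings


class TransportWrapperSketch:
    """Hypothetical stand-in for a transport wrapper, for illustration only."""

    def __init__(self, ipaddr, port):
        self._ipaddr = ipaddr
        self._port = port

    @property
    def exa_address(self):
        # New-style access: a read-only property
        return f"{self._ipaddr}:{self._port}"

    def get_proxy(self):
        # Old-style access: still works, but warns the caller
        warnings.warn(
            "get_proxy() is deprecated, use the exa_address property instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.exa_address
```

Existing callers keep working and see a DeprecationWarning, while new code reads the property directly.
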
Rationale:
- Threading provides much better compatibility with Windows OS and various exotic setups (e.g. uWSGI).
- Orphan "http_transport" processes will no longer be a problem.
- Modern Pandas and Dask can (mostly) release GIL when reading or writing CSV streams.
- HTTP thread is primarily dealing with network I/O and zlib compression, which (mostly) release GIL as well.
Execution time for small data sets might be improved by 1-2s, since another Python interpreter is no longer started from scratch. Execution time for very large data sets might be ~2-5% worse for CPU bound workloads and unchanged for network bound workloads.
Also, examples were re-arranged in this version, refactored and grouped into multiple categories.
- Fixed a bug in ExaStatement when no rows were fetched. It could happen when data set has fewer than 1000 rows, but the amount of data exceeds maximum chunk size.
- "HTTP Transport" and "Script Output" subprocess will now restore default handler for SIGTERM signal.
In some cases custom signal handlers can be inherited from parent process, which causes unwanted side effects and prevents correct termination of child process.
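
Restoring the default SIGTERM handler in a child process comes down to a single call from the standard `signal` module. The sketch below shows that call in isolation; `restore_default_sigterm` is a hypothetical helper name, and the call must run in the main thread of the process.

```python
import signal


def restore_default_sigterm():
    """Reset SIGTERM to the OS default action and return the previous handler.

    Hypothetical helper. A child process would call this early on, so that
    a custom handler inherited from the parent no longer applies.
    """
    return signal.signal(signal.SIGTERM, signal.SIG_DFL)


if __name__ == "__main__":
    # Simulate a handler that a parent process might have installed
    signal.signal(signal.SIGTERM, lambda signum, frame: None)

    restore_default_sigterm()
    print(signal.getsignal(signal.SIGTERM) == signal.SIG_DFL)
```
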
- Default protocol_version is now 3.
- Dropped support for Exasol versions 6.0 and 6.1.
These versions have reached "end of life" and are no longer supported by vendor. It is still possible to connect to older Exasol versions using PyEXASOL, but you may have to set protocol_version=1 connection option explicitly.
- Send snapshotTransactionsEnabled attribute only if set explicitly in connection option, prepare for Exasol 7.1 release.
- Default snapshot_transaction connection option is now None (database default), previously it was False.
- Added connection options access_token and refresh_token to support OpenID Connect in WebSocket Protocol V3.
- PyEXASOL default protocol version will be upgraded to 3 if connection option access_token or refresh_token is used.
- Fixed orphan process check in HTTP Transport being enabled on Windows instead of POSIX OS.
- Enforced TCP keep-alive for HTTP transport connections for Linux, MacOS and Windows. Keep-alive is required to address Google Cloud firewall rules dropping idle connections after 10 minutes.
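
Enabling TCP keep-alive differs slightly between Linux, macOS and Windows. The following sketch shows one portable way to do it with the standard `socket` module. The function name and the timing values are placeholders chosen for the example, not the values used by the library.

```python
import socket
import sys


def enable_keepalive(sock, idle_seconds=60, interval_seconds=10):
    """Turn on TCP keep-alive probes for a socket (illustrative values)."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    if sys.platform.startswith("linux"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_seconds)
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_seconds)
    elif sys.platform == "darwin":
        # macOS names the idle-time option TCP_KEEPALIVE (0x10); older
        # Python versions do not expose the constant, hence the fallback
        tcp_keepalive = getattr(socket, "TCP_KEEPALIVE", 0x10)
        sock.setsockopt(socket.IPPROTO_TCP, tcp_keepalive, idle_seconds)
    elif sys.platform == "win32":
        # Windows expects (on/off, idle time in ms, interval in ms)
        sock.ioctl(
            socket.SIO_KEEPALIVE_VALS,
            (1, idle_seconds * 1000, interval_seconds * 1000),
        )
```

With an idle time well below 10 minutes, probes keep the connection visible to firewalls that drop idle connections.
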
- Added INTERVAL DAY TO SECOND data type support for standard fetch mapper exasol_mapper. Now it returns instances of class ExaTimeDelta derived from Python datetime.timedelta.
It may potentially cause some issues with existing code. If it does, you may define your own custom fetch_mapper. Alternatively, you may call ExaTimeDelta.to_interval() or cast the object to string to get back to original values.
- Added comment parameter for HTTP transport. It allows adding custom SQL comments to EXPORT and IMPORT queries generated by HTTP transport query builders.
- Added websocket_sslopt connection option, to set custom SSL options for WebSocket connection. See WebSocket client code for more details.
- Add a basic benchmark to compare performance of individual nodes. Documentation will be added shortly.
- Run Travis tests with lowest (3.6) and highest (3.9) supported Python versions only.
- Updated description and classifiers for PyPi.
- Fixed the problem with delimit HTTP transport parameter expecting NONE value instead of NEVER.
- Added protocol_version connection option to adjust the protocol version requested by client (default: pyexasol.PROTOCOL_V1).
- Added .protocol_version() function to check the actual protocol version of established connection.
- Added .meta.execute_meta_nosql() function to run no SQL metadata commands introduced in Exasol v7.0+.
- Function .meta.execute_snapshot() is now public. You may use it to run complex metadata SQL queries in snapshot isolation mode.
- Added ability to execute no SQL metadata commands AND process the response as normal SQL-like result set. It does not change anything in public interface, but it might have an impact if you use custom overloaded ExaStatement class.
- Re-throw BrokenPipeError (and other sub-classes of ConnectionError) as ExaCommunicationError. This type of error might not be handled in WebSocket client library in certain cases.
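
Re-throwing low-level connection errors as a single library-level exception typically looks like the sketch below. `CommunicationErrorSketch` and `send_request` are hypothetical names standing in for the real exception class and request code.

```python
class CommunicationErrorSketch(Exception):
    """Hypothetical library-level error, stands in for the real exception class."""


def send_request(send_func, payload):
    """Call send_func(payload) and translate connection-level failures.

    BrokenPipeError, ConnectionResetError and similar errors are all
    sub-classes of the built-in ConnectionError, so one "except" clause
    covers them. "raise ... from" keeps the original error as __cause__.
    """
    try:
        return send_func(payload)
    except ConnectionError as e:
        raise CommunicationErrorSketch(f"Connection failed: {e}") from e
```

Callers then need to handle only one exception type, while the original error remains available for debugging.
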
- Added optional disconnect command executed during .close(). It is now enabled by default, but can be disabled with explicit .close(disconnect=False) to revert to original behaviour;
- Added csv_cols to HTTP transport parameters. It allows skipping some columns in CSV and adjusting numeric and date formats during IMPORT and EXPORT. It is still recommended to implement your own data transformation layer, since csv_cols capabilities are limited;
- Added .meta sub-set of functions to execute lock-free meta data requests using /*snapshot execution*/ SQL hint;
- Deprecated some .ext functions executing queries similar to .meta (code remains in place for compatibility);
- Added connection option connection_timeout in addition to existing option socket_timeout. connection_timeout is applied during initial connection only and socket_timeout is applied for all other requests, including actual login procedure.
- Reworked error handling for HTTP transport to take care of even more complex failure scenarios;
- Reworked internals of SQL builder for IMPORT / EXPORT parameters;
- ExaStatement should now properly release result set handle after fetching and object termination;
- Removed weakref, it was not related to previous garbage collector problems;
- Renamed previously added .connection_time to .login_time, which is more accurate name for this timer;
- Query text length in ExaQueryError exception is now limited to 20k characters to prevent logs from bloating;
- Fixed open_schema function with quote_ident=True;
- .last_statement() now always returns last ExaStatement executed on this connection. Previously it was skipping technical queries from ExaExtension (.ext);
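
The split between connection_timeout and socket_timeout mentioned in the list above can be pictured with plain sockets. This is a sketch of the general idea, not the library's connection code; `open_socket` and its default values are illustrative.

```python
import socket


def open_socket(hostname, port, connection_timeout=10, socket_timeout=30):
    """Open a TCP connection using two separate timeouts (illustrative values).

    connection_timeout limits only the initial connection attempt.
    socket_timeout applies to every later send or receive on the socket.
    """
    sock = socket.create_connection((hostname, port), timeout=connection_timeout)
    sock.settimeout(socket_timeout)
    return sock
```

A short connection timeout lets a client give up quickly on an unreachable node, while a longer socket timeout leaves room for slow requests on an established connection.
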
- Added option client_os_username to specify custom client OS username. Previously username was always detected automatically with getpass.getuser(), but it might not work in some environments, like OpenShift.
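
The idea behind an explicit username option is to bypass auto-detection when it is unreliable. A minimal sketch follows; `resolve_client_os_username` is a hypothetical helper, and the "unknown" fallback is an assumption added for the example rather than documented library behaviour.

```python
import getpass


def resolve_client_os_username(client_os_username=None):
    """Pick the OS username to report, preferring an explicit value."""
    if client_os_username is not None:
        return client_os_username

    try:
        # May fail in containers where the current UID has no passwd entry
        return getpass.getuser()
    except Exception:
        # Assumed fallback for this sketch only
        return "unknown"
```
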
- Added .connection_time property to measure execution time of two login requests required to establish connection.
- Reworked close() method. It is now sending basic OP_CLOSE WebSocket frame instead of disconnect command.
- Method close() is now called implicitly during destruction of ExaConnection object to terminate IDLE session and free resources on Exasol server side immediately.
- ExaFormatter, ExaExtension, ExaLogger objects now have weak reference to the main ExaConnection object. It helps to prevent circular reference problem which stopped ExaConnection object from being processed by Python garbage collector.
- Connection will be closed automatically after receiving WebSocketException and raising ExaCommunicationError. It should prevent connection from being stuck in invalid state.
- Reworked script output code and moved it into pyexasol_utils module. The new way to start script output server in debug mode is: python -m pyexasol_utils.script_output. Old call will produce the RuntimeException with directions.
- Removed .utils sub-module.
- Renamed pyexasol_utils.http into pyexasol_utils.http_transport for consistency.
- Fixed bug of .execute_udf_output() not working with empty udf_output_bind_address.
- Added function _encrypt_password(), logic was moved from .utils.
- Added function _get_stmt_output_dir(), logic was moved from .utils. It is now possible to overload this function.
- Metadata functions (starting with .ext.get_sys_*) are now using /*snapshot execution*/ SQL hint described in IDEA-476 to prevent locks.
- Added insert_multi function to allow faster INSERTs for small data sets using prepared statement.
- DSN hostname ranges with zero-padding are now supported (e.g. myhost01..16).
- Context manager ("with" statement) is now supported for connection object.
- Context manager ("with" statement) is now supported for statement object.
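
Expanding a hostname range such as myhost01..16 while keeping the zero-padding can be sketched as follows. The function and its regular expression are illustrative and cover only the range part of a DSN (no port, no comma-separated lists); the library's own parser may differ.

```python
import re


def expand_hostname_range(hostname):
    """Expand "myhost01..16" into ["myhost01", ..., "myhost16"].

    Hypothetical helper. Numbers are padded with zeros to the width of
    the range start, so "01..16" keeps two digits and "8..12" is not padded.
    """
    m = re.fullmatch(r"(.+?)(\d+)\.\.(\d+)(.*)", hostname)
    if m is None:
        return [hostname]  # not a range, return as-is

    prefix, start, end, suffix = m.groups()
    if int(start) > int(end):
        raise ValueError(f"Invalid hostname range in [{hostname}]")

    width = len(start)
    return [
        f"{prefix}{i:0{width}d}{suffix}"
        for i in range(int(start), int(end) + 1)
    ]
```
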
- Added read-only .options property holding original arguments used to create ExaConnection object.
- Added read-only .login_info property holding response data of LOGIN command.
- Added documentation for read-only .attr property holding attributes of current connection (autocommit state, etc.).
- Removed undocumented .meta property, renamed it to .login_info.
- Removed undocumented .last_stmt property. Please use .last_statement() function instead.
- Removed most of exposed properties related to connection options (e.g. .autocommit). Please use .options or .attr instead.
- Added documentation for read-only .execution_time property holding wall-clock execution time of SQL statement.