Cask Data Application Platform v3.2.0
sreevatsanraman
released this
24 Sep 04:56
New Features
- Added support for HBase 1.1.(CDAP-2556)
- Added a new API for creating an application from an artifact.(CDAP-2666)
- Added the ability to write to multiple outputs from a MapReduce job.(CDAP-2756)
- Added the ability to dynamically write to multiple partitions of a PartitionedFileSet dataset as the output of a MapReduce job.(CDAP-2757)
- Added a Stream and Dataset Widget to the CDAP-UI.(CDAP-3253)
- Added stream views, enabling reading from a single stream using various formats and schemas.(CDAP-3390)
- Added a Validator Transform that can be used to validate records based on a set of available validators and configured to write invalid records to an error dataset.(CDAP-3476)
- Added a service to manage the metadata of CDAP entities.(CDAP-3516)
- Added the publishing of metadata change notifications to Apache Kafka.(CDAP-3518)
- Added the ability to compute lineage of a CDAP dataset or stream in a given time window.(CDAP-3519)
- Added RESTful APIs for adding, retrieving, and deleting metadata for apps, programs, datasets, and streams.(CDAP-3520)
- Added the ability to record a dataset or stream access by a CDAP program.(CDAP-3521)
- Added the capability to search CDAP entities based on their metadata.(CDAP-3522)
- Added RESTful APIs for searching CDAP entities based on business metadata.(CDAP-3523)
- Added a data store to manage business metadata of CDAP entities.(CDAP-3527)
- Added SSH port forwarding to the CDAP virtual machine.(CDAP-3549)
- Added a data store for recording data accesses by CDAP programs and computing lineage.(CDAP-3556)
- Added the ability to write to multiple sinks in ETL real-time and batch applications.(CDAP-3590)
- Added the ability for real-time ETL pipelines to write to multiple sinks.(CDAP-3591)
- Added the ability for batch ETL pipelines to write to multiple sinks.(CDAP-3592)
- For the CSV and TSV stream formats, a “mapping” setting can now be specified, mapping stream event columns to schema columns.(CDAP-3626)
- Added support for CDAP to work with HDP 2.3.(CDAP-3693)
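
To make the CSV/TSV "mapping" setting (CDAP-3626) more concrete, here is a minimal sketch of the idea: pick columns out of a delimited stream event body and assign them to schema field names. It does not use CDAP's API or its actual setting syntax; the function `parse_with_mapping`, the dict form of the mapping, and the sample event are all hypothetical.

```python
import csv
import io


def parse_with_mapping(body, mapping, delimiter=","):
    """Split a delimited event body and keep only the mapped columns.

    `mapping` maps a zero-based column index to a schema field name.
    Both this function and the dict form are hypothetical illustrations,
    not CDAP's own configuration format.
    """
    row = next(csv.reader(io.StringIO(body), delimiter=delimiter))
    record = {}
    for index, field in mapping.items():
        # A column missing from a short row becomes None instead of an error.
        record[field] = row[index] if index < len(row) else None
    return record


# Hypothetical event body with four columns; only two are mapped.
event_body = "2015-09-24,alice,login,203.0.113.7"
mapping = {1: "user", 2: "action"}
record = parse_with_mapping(event_body, mapping)
print(record)
```

Unmapped columns are simply dropped, which is one plausible reading of the feature; the real setting may treat them differently.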
Improvements
- Added documentation of the RESTful endpoint to retrieve the properties of a stream.(CDAP-1914)
- Added an interface to load a file into a stream from the CDAP-UI.(CDAP-2514)
- The CDAP-UI “Errors” pop-up in the main screen now displays the time and date for each error.(CDAP-2809)
- Updated the Cloudera Manager CSD to use support for logback.(CDAP-2872)
- Cleaned up the messages shown in the errors dropdown in the CDAP-UI.(CDAP-2950)
- Added a CDAP-CLI command to stop a workflow.(CDAP-3147)
- Added support for upgrading the Hadoop distribution or the HBase version that CDAP is running on.(CDAP-3179)
- Revised the documentation of the file cdap-default.xml, removed properties no longer in use, and corrected discrepancies between the documentation and the shipped XML file.(CDAP-3257)
- Improved the help provided in the CDAP-CLI for the setting of stream formats.(CDAP-3270)
- Upgraded netty-http version to 0.12.0.(CDAP-3275)
- Added an HTTP RESTful API to update the application configuration and artifact version.(CDAP-3282)
- Added a “clear” button in the CDAP-UI for cases where a user decides not to use a pre-populated schema.(CDAP-3332)
- Defined a directory structure to be used for predefined applications.(CDAP-3351)
- Added documentation in the source code on adding new commands and completers to the CDAP-CLI.(CDAP-3357)
- In the CDAP-UI, added visualization for Workflow tokens in Workflows.(CDAP-3393)
- HBaseQueueDebugger now shows the minimum queue event transaction write pointer both for each queue and for all queues.(CDAP-3419)
- Added an example cdap-env.sh to the shipped packages.(CDAP-3443)
- Added an example in the documentation explaining how to prune invalid transactions from the transaction manager.(CDAP-3464)
- Modified the CDAP upgrade tool to delete all adapters and the ETLBatch and ETLRealtime ApplicationTemplates.(CDAP-3490)
- Added the ability to persist the runtime arguments with which a program was run.(CDAP-3495)
- Added support for writing to Amazon S3 in Avro and Parquet formats from batch ETL applications.(CDAP-3550)
- Updated CDAP to use Tephra 0.6.2.(CDAP-3564)
- Updated the transaction debugger client to print checkpoint information.(CDAP-3610)
Bug Fixes
- Fixed an issue where failed dataset operations via Explore queries did not invalidate the associated transaction.(CDAP-1697)
- Fixed a problem where users got an incorrect message while creating a dataset in a non-existent namespace.(CDAP-1864)
- Fixed a problem with services returning the same message for all failures.(CDAP-1892)
- Fixed a problem where a dataset could be created in a non-existent namespace in standalone mode.(CDAP-1984)
- Fixed a problem with the CDAP-CLI creating file logs.(CDAP-2428)
- Fixed a problem with the CDAP-CLI not auto-completing when setting a stream format.(CDAP-2521)
- Fixed a problem in the CDAP-UI with buttons staying ‘in focus’ after clicking.(CDAP-2785)
- Fixed a problem with schedules not being deployed in suspended mode.(CDAP-2892)
- Fixed a problem where failure of a Spark node would cause a workflow to restart indefinitely.(CDAP-3014)
- Fixed an issue with the CDAP standalone process periodically crashing with Out-of-Memory errors when writing to an Oracle table.(CDAP-3073)
- Fixed a problem with workflow runs not getting scheduled due to Quartz exceptions.(CDAP-3101)
- Fixed a problem with discrepancies between the documentation and the defaults actually used by CDAP.(CDAP-3121)
- Fixed a problem in the CDAP-UI with the clone button in an incorrect position when using Firefox.(CDAP-3200)
- Fixed a problem in the CDAP-UI with an incorrect tabbing order when using Firefox.(CDAP-3201)
- Fixed a problem when specifying the HBase version using the HBASE_VERSION environment variable.(CDAP-3219)
- Fixed a problem with the CDAP-UI error pop-ups not having a default focus on a button.(CDAP-3233)
- Fixed a problem in the CDAP-UI with the default schema shown for streams.(CDAP-3243)
- Fixed a problem in the CDAP-UI with scrolling on the namespaces dropdown on certain pages.(CDAP-3260)
- Fixed a problem in CDAP distributed mode with the serializing of the metadata artifact causing a stack overflow.(CDAP-3261)
- Fixed a problem with the CDAP-UI not warning users if they exit or close their browser without saving.(CDAP-3305)
- Fixed a problem in the CDAP-UI with refreshing always returning to the overview page.(CDAP-3313)
- Fixed a problem with the table batch source requiring a row key to be set.(CDAP-3326)
- Fixed a problem with the application deployment for apps that contain Spark.(CDAP-3343)
- Fixed a problem with the display of ETL application metrics in the CDAP-UI.(CDAP-3349)
- Fixed a problem in the CDAP examples with the use of a runtime argument, min.pages.threshold.(CDAP-3355)
- Fixed a problem with the logback-container.xml not being copied into master services.(CDAP-3362)
- Fixed a problem with warning messages in the logs indicating that programs were running that actually were not running.(CDAP-3374)
- Fixed a problem with being unable to deploy the SparkPageRank example application on a cluster.(CDAP-3376)
- Fixed a problem with the Spark classes not being found when running a Spark program through a Workflow in CDAP Distributed mode on HDP 2.2.(CDAP-3386)
- Fixed a problem with the deployment of applications through the CDAP-UI.(CDAP-3394)
- Fixed a problem with the SparkPageRankApp example spawning multiple containers in distributed mode due to its number of services.(CDAP-3399)
- Fixed an issue with warning messages about the notification system every time the CDAP Standalone is restarted.(CDAP-3400)
- Fixed a problem with running the CDAP Explore Service on CDH 5.[2,3].(CDAP-3408)
- Fixed a bug where connecting with a certain namespace from the CLI would not immediately display that namespace in the CLI prompt.(CDAP-3432)
- Fixed an issue where the program status was shown as running even after it was stopped.(CDAP-3435)
- Fixed a problem that caused application creation to fail if a config setting was given to an application that does not use a config.(CDAP-3442)
- Fixed a problem with the readless increment co-processor not handling multiple readless increment columns in the same row.(CDAP-3449)
- Fixed a problem that prevented the Explore service from working on clusters with secure Hive 0.14.(CDAP-3452)
- Fixed a problem where stream events that had already been processed were re-processed in flows.(CDAP-3458)
- Fixed an issue with error messages being logged during a master process restart.(CDAP-3470)
- Fixed the error message returned when trying to stop a program started by a workflow.(CDAP-3472)
- Fixed a problem with a workflow failure not updating a run record for the inner program.(CDAP-3473)
- Fixed a problem with the CDAP-UI performance when rendering flow diagrams with a large number of nodes.(CDAP-3530)
- Removed faulty and unused metrics around CDAP file resource usage.(CDAP-3563)
- Fixed an issue with Explore not working on HDP Hive 0.12.(CDAP-3574)
- Fixed an issue with configuration properties for ETL Transforms being validated at runtime instead of when an application is created.(CDAP-3603)
- Fixed a problem where suspended schedules were lost when CDAP master was restarted.(CDAP-3618)
- Fixed an issue where the Hadoop filesystem object was getting instantiated before the Kerberos keytab login was completed, leading to CDAP processes failing after the initial ticket expired.(CDAP-3660)
- Fixed an issue with the log saver having numerous open connections to HBase, causing it to go Out-of-Memory.(CDAP-3700)
- Fixed an issue that prevented the downloading of Explore results on a secure cluster.(CDAP-3711)
- Fixed an issue where certain RESTful APIs were not returning appropriate error messages for internal server errors.(CDAP-3713)
- Fixed a possible deadlock when CDAP master is restarted with an existing app running on a cluster.(CDAP-3716)
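
The CDAP-3603 fix above moves validation of ETL Transform properties from runtime to application creation. The sketch below shows that general fail-fast pattern only; it is not CDAP's implementation, and `TransformConfigError`, `validate_properties`, `Pipeline`, and the property names are hypothetical.

```python
class TransformConfigError(ValueError):
    """Raised when a transform's properties are invalid (hypothetical type)."""


def validate_properties(name, properties, required):
    """Fail if any required property is absent from `properties`."""
    missing = sorted(set(required) - set(properties))
    if missing:
        raise TransformConfigError(
            "transform '%s' is missing properties: %s" % (name, ", ".join(missing))
        )


class Pipeline:
    """Hypothetical pipeline that validates every transform on creation."""

    def __init__(self, transforms):
        # Validate eagerly, so a bad configuration is rejected here
        # rather than surfacing later while records are being processed.
        for name, properties, required in transforms:
            validate_properties(name, properties, required)
        self.transforms = transforms


# Hypothetical transform whose required 'errorDataset' property is absent.
bad = [("validator", {"validators": "core"}, {"validators", "errorDataset"})]
try:
    Pipeline(bad)
    created = True
except TransformConfigError as exc:
    created = False
    message = str(exc)
    print(message)
```

With validation in the constructor, the misconfigured pipeline is never created, which mirrors the intent of rejecting bad properties at application-creation time.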