Cask Data Application Platform v3.2.0

sreevatsanraman released this 24 Sep 04:56

New Features

  • Added support for HBase 1.1.(CDAP-2556)
  • Added a new API for creating an application from an artifact.(CDAP-2666)
  • Added the ability to write to multiple outputs from a MapReduce job.(CDAP-2756)
  • Added the ability to dynamically write to multiple partitions of a PartitionedFileSet dataset as the output of a MapReduce job.(CDAP-2757)
  • Added a Stream and Dataset Widget to the CDAP-UI.(CDAP-3253)
  • Added stream views, enabling reading from a single stream using various formats and schemas.(CDAP-3390)
  • Added a Validator Transform that can be used to validate records based on a set of available validators and configured to write invalid records to an error dataset.(CDAP-3476)
  • Added a service to manage the metadata of CDAP entities.(CDAP-3516)
  • Added the publishing of metadata change notifications to Apache Kafka.(CDAP-3518)
  • Added the ability to compute lineage of a CDAP dataset or stream in a given time window.(CDAP-3519)
  • Added RESTful APIs for adding, retrieving, and deleting metadata for apps, programs, datasets, and streams.(CDAP-3520)
  • Added the ability to record a dataset or stream access by a CDAP program.(CDAP-3521)
  • Added the capability to search CDAP entities based on their metadata.(CDAP-3522)
  • Added RESTful APIs for searching CDAP entities based on business metadata.(CDAP-3523)
  • Added a data store to manage business metadata of CDAP entities.(CDAP-3527)
  • Added SSH port forwarding to the CDAP virtual machine.(CDAP-3549)
  • Added a data store for recording data accesses by CDAP programs and computing lineage.(CDAP-3556)
  • Added the ability to write to multiple sinks in ETL real-time and batch applications.(CDAP-3590)
  • Added the ability for real-time ETL pipelines to write to multiple sinks.(CDAP-3591)
  • Added the ability for batch ETL pipelines to write to multiple sinks.(CDAP-3592)
  • For the CSV and TSV stream formats, a “mapping” setting can now be specified, mapping stream event columns to schema columns.(CDAP-3626)
  • Added support for CDAP to work with HDP 2.3.(CDAP-3693)
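
The column “mapping” setting for CSV and TSV stream formats (CDAP-3626) can be pictured with a small stand-alone sketch. This is not CDAP code: the `index:name` spec syntax and the helper names `parse_mapping` and `apply_mapping` are assumptions made for illustration only, and the actual setting syntax should be taken from the CDAP stream format documentation.

```python
import csv
from io import StringIO


def parse_mapping(spec):
    """Parse a hypothetical 'index:name' spec such as '0:user,2:amount'."""
    mapping = {}
    for pair in spec.split(","):
        index, name = pair.split(":", 1)
        mapping[int(index.strip())] = name.strip()
    return mapping


def apply_mapping(event_body, mapping, delimiter=","):
    """Pick the mapped columns out of one delimited event body."""
    row = next(csv.reader(StringIO(event_body), delimiter=delimiter))
    # Columns without a mapping entry are dropped; columns missing
    # from the event body are filled with None.
    return {name: (row[i] if i < len(row) else None)
            for i, name in mapping.items()}


# Illustrative usage: keep columns 0 and 2 of a CSV event body.
mapping = parse_mapping("0:user,2:amount")
record = apply_mapping("alice,ignored,42", mapping)
```

The same helper covers the TSV case by passing `delimiter="\t"`. Values stay as strings here; converting them to schema types is left out of the sketch.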

Improvements

  • Added documentation of the RESTful endpoint to retrieve the properties of a stream.(CDAP-1914)
  • Added an interface to load a file into a stream from the CDAP-UI.(CDAP-2514)
  • The CDAP-UI “Errors” pop-up in the main screen now displays the time and date for each error.(CDAP-2809)
  • Updated the Cloudera Manager CSD to support logback.(CDAP-2872)
  • Cleaned up the messages shown in the errors dropdown in the CDAP-UI.(CDAP-2950)
  • Added a CDAP-CLI command to stop a workflow.(CDAP-3147)
  • Added support for upgrading the Hadoop distribution or the HBase version that CDAP is running on.(CDAP-3179)
  • Revised the documentation of the file cdap-default.xml, removed properties no longer in use, and corrected discrepancies between the documentation and the shipped XML file.(CDAP-3257)
  • Improved the help provided in the CDAP-CLI for the setting of stream formats.(CDAP-3270)
  • Upgraded netty-http version to 0.12.0.(CDAP-3275)
  • Added an HTTP RESTful API to update the application configuration and artifact version.(CDAP-3282)
  • Added a “clear” button in the CDAP-UI for cases where a user decides not to use a pre-populated schema.(CDAP-3332)
  • Defined a directory structure to be used for predefined applications.(CDAP-3351)
  • Added documentation in the source code on adding new commands and completers to the CDAP-CLI.(CDAP-3357)
  • In the CDAP-UI, added visualization for Workflow tokens in Workflows.(CDAP-3393)
  • HBaseQueueDebugger now shows the minimum queue event transaction write pointer both for each queue and for all queues.(CDAP-3419)
  • Added an example cdap-env.sh to the shipped packages.(CDAP-3443)
  • Added an example in the documentation explaining how to prune invalid transactions from the transaction manager.(CDAP-3464)
  • Modified the CDAP upgrade tool to delete all adapters and the ETLBatch and ETLRealtime ApplicationTemplates.(CDAP-3490)
  • Added the ability to persist the runtime arguments with which a program was run.(CDAP-3495)
  • Added support for writing to Amazon S3 in Avro and Parquet formats from batch ETL applications.(CDAP-3550)
  • Updated CDAP to use Tephra 0.6.2.(CDAP-3564)
  • Updated the transaction debugger client to print checkpoint information.(CDAP-3610)

Bug Fixes

  • Fixed an issue where failed dataset operations via Explore queries did not invalidate the associated transaction.(CDAP-1697)
  • Fixed a problem where users got an incorrect message while creating a dataset in a non-existent namespace.(CDAP-1864)
  • Fixed a problem with services returning the same message for all failures.(CDAP-1892)
  • Fixed a problem where a dataset could be created in a non-existent namespace in standalone mode.(CDAP-1984)
  • Fixed a problem with the CDAP-CLI creating file logs.(CDAP-2428)
  • Fixed a problem with the CDAP-CLI not auto-completing when setting a stream format.(CDAP-2521)
  • Fixed a problem in the CDAP-UI with buttons staying ‘in focus’ after clicking.(CDAP-2785)
  • Fixed a problem with schedules not being deployed in suspended mode.(CDAP-2892)
  • Fixed a problem where failure of a Spark node would cause a workflow to restart indefinitely.(CDAP-3014)
  • Fixed an issue with the CDAP standalone process periodically crashing with Out-of-Memory errors when writing to an Oracle table.(CDAP-3073)
  • Fixed a problem with workflow runs not getting scheduled due to Quartz exceptions.(CDAP-3101)
  • Fixed a problem with discrepancies between the documentation and the defaults actually used by CDAP.(CDAP-3121)
  • Fixed a problem in the CDAP-UI with the clone button in an incorrect position when using Firefox.(CDAP-3200)
  • Fixed a problem in the CDAP-UI with an incorrect tabbing order when using Firefox.(CDAP-3201)
  • Fixed a problem when specifying the HBase version using the HBASE_VERSION environment variable.(CDAP-3219)
  • Fixed a problem in the CDAP-UI with error pop-ups not having a default focus on a button.(CDAP-3233)
  • Fixed a problem in the CDAP-UI with the default schema shown for streams.(CDAP-3243)
  • Fixed a problem in the CDAP-UI with scrolling on the namespaces dropdown on certain pages.(CDAP-3260)
  • Fixed a problem in CDAP distributed mode with the serializing of the metadata artifact causing a stack overflow.(CDAP-3261)
  • Fixed a problem with the CDAP-UI not warning users if they exit or close their browser without saving.(CDAP-3305)
  • Fixed a problem in the CDAP-UI with refreshing always returning to the overview page.(CDAP-3313)
  • Fixed a problem with the table batch source requiring a row key to be set.(CDAP-3326)
  • Fixed a problem with the application deployment for apps that contain Spark.(CDAP-3343)
  • Fixed a problem with the display of ETL application metrics in the CDAP-UI.(CDAP-3349)
  • Fixed a problem in the CDAP examples with the use of a runtime argument, min.pages.threshold.(CDAP-3355)
  • Fixed a problem with the logback-container.xml not being copied into master services.(CDAP-3362)
  • Fixed a problem with warning messages in the logs indicating that programs were running that actually were not running.(CDAP-3374)
  • Fixed a problem with being unable to deploy the SparkPageRank example application on a cluster.(CDAP-3376)
  • Fixed a problem with the Spark classes not being found when running a Spark program through a Workflow in CDAP Distributed mode on HDP 2.2.(CDAP-3386)
  • Fixed a problem with the deployment of applications through the CDAP-UI.(CDAP-3394)
  • Fixed a problem with the SparkPageRankApp example spawning multiple containers in distributed mode due to its number of services.(CDAP-3399)
  • Fixed an issue with warning messages about the notification system every time the CDAP Standalone is restarted.(CDAP-3400)
  • Fixed a problem with running the CDAP Explore Service on CDH 5.[2,3].(CDAP-3408)
  • Fixed a bug where connecting with a certain namespace from the CLI would not immediately display that namespace in the CLI prompt.(CDAP-3432)
  • Fixed an issue where the program status was shown as running even after the program was stopped.(CDAP-3435)
  • Fixed a problem that caused application creation to fail if a config setting was given to an application that does not use a config.(CDAP-3442)
  • Fixed a problem with the readless increment co-processor not handling multiple readless increment columns in the same row.(CDAP-3449)
  • Fixed a problem that prevented the Explore service from working on clusters with secure Hive 0.14.(CDAP-3452)
  • Fixed a problem where stream events that had already been processed were re-processed in flows.(CDAP-3458)
  • Fixed an issue with error messages being logged during a master process restart.(CDAP-3470)
  • Fixed the error message returned when trying to stop a program started by a workflow.(CDAP-3472)
  • Fixed a problem with a workflow failure not updating a run record for the inner program.(CDAP-3473)
  • Fixed a problem with the CDAP-UI performance when rendering flow diagrams with a large number of nodes.(CDAP-3530)
  • Removed faulty and unused metrics around CDAP file resource usage.(CDAP-3563)
  • Fixed an issue with Explore not working on HDP Hive 0.12.(CDAP-3574)
  • Fixed an issue with configuration properties for ETL Transforms being validated at runtime instead of when an application is created.(CDAP-3603)
  • Fixed a problem where suspended schedules were lost when CDAP master was restarted.(CDAP-3618)
  • Fixed an issue where the Hadoop filesystem object was getting instantiated before the Kerberos keytab login was completed, leading to CDAP processes failing after the initial ticket expired.(CDAP-3660)
  • Fixed an issue with the log saver having numerous open connections to HBase, causing it to go Out-of-Memory.(CDAP-3700)
  • Fixed an issue that prevented the downloading of Explore results on a secure cluster.(CDAP-3711)
  • Fixed an issue where certain RESTful APIs were not returning appropriate error messages for internal server errors.(CDAP-3713)
  • Fixed a possible deadlock when CDAP master is restarted with an existing app running on a cluster.(CDAP-3716)