Cask Data Application Platform v3.4.1

@awholegunch awholegunch released this 13 May 18:07

Bug Fixes

  • Fixed a race condition bug in ResourceCoordinator that prevented
    performing partition assignment in the correct order. It affects the
    metrics processor and stream coordinator.
  • Avoided the cancellation of delegation tokens upon completion of
    Explore-launched MapReduce and Spark jobs, as these delegation tokens
    are shared by CDAP system services.
  • Removed 'SNAPSHOT' from the artifact version of apps created by
    default by the CDAP UI. This fixes deploying Cask Tracker and Navigator
    apps, enabling Cask Tracker from the CDAP UI.
  • Fixed a bug that caused SDK builds to fail when using 3.3.x versions
    of Maven. (CDAP-5884)
  • Fixed the Hydrator upgrade tool to correctly write out pipeline
    configs that failed to upgrade.
  • The CDAP Standalone now deploys and starts the Cask Tracker app in the
    default namespace if the Tracker artifact is present.
  • External processes started by CDAP (ZooKeeper and Kafka) are now shut
    down when there is an error during either startup or shutdown of CDAP.
  • Fixed an issue where parsing of an Avro schema was failing when it
    included optional fields such as 'doc' or 'default'.
  • Fixed a bug in the BatchReadableRDD so that it won't skip records when
    used by DataFrame. (CDAP-5947)
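
For reference, the Avro parsing fix listed above concerns schemas that carry
the optional 'doc' and 'default' attributes. The sketch below shows what such
a schema can look like. The record and field names are made up for
illustration, and the snippet uses only Python's json module to inspect the
schema; it does not exercise CDAP's own parser.

```python
import json

# Hypothetical schema: the record name and field names are illustrative only.
SCHEMA_JSON = """
{
  "type": "record",
  "name": "Purchase",
  "doc": "A single purchase event.",
  "fields": [
    {"name": "customer", "type": "string", "doc": "Customer name."},
    {"name": "quantity", "type": "int", "default": 1},
    {"name": "coupon", "type": ["null", "string"], "default": null}
  ]
}
"""


def optional_attributes(schema_json):
    """Map each field name to the optional attributes ('doc', 'default') it sets."""
    schema = json.loads(schema_json)
    return {
        field["name"]: sorted(key for key in ("doc", "default") if key in field)
        for field in schema["fields"]
    }


summary = optional_attributes(SCHEMA_JSON)
```

Here `summary` lists, per field, which of the two optional attributes are
present, which is a quick way to spot schemas affected by the earlier parsing
failure.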

Known Issues

  • After upgrading CDAP from a pre-3.0 version, any unprocessed metrics
    data in Kafka will be lost, and WARN log messages will report that
    data in the old format cannot be processed.
  • When running secure Hadoop clusters, debug logs from MapReduce
    programs are not available.
  • If the Hive Metastore is restarted while the CDAP Explore Service is
    running, the Explore Service remains alive but becomes unusable. To
    correct this, restart the CDAP Master (which will restart all services)
    as described under "Starting CDAP Services" for your particular Hadoop
    distribution in the Installation
  • CDAP internally creates tables in the "user" space that begin with the
    word "system". User datasets with names starting with "system"
    can conflict if they match one of those names. To avoid this, do
    not start any dataset name with the word "system".
  • The application in the cdap-kafka-ingest-guide
    does not run on Ubuntu 14.x as of CDAP 3.0.x. (CDAP-2632)
  • Metrics for FileSets can show zero values even if there is data
    present, because FileSets do not emit metrics.
  • A workflow that is scheduled by time will not be run between the
    failure of the primary master and the time that the secondary takes
    over. This scheduled run will not be triggered at all.
  • Spark jobs on a Kerberos-enabled CDAP cluster cannot run longer than
    the delegation token expiration.
  • If the input partition filter for a PartitionedFileSet does not match
    any partitions, MapReduce jobs can fail.
  • The Workflow token is in an inconsistent state for nodes in a fork
    while the nodes of the fork are still running. It becomes consistent
    after the join. (CDAP-3000)
  • When running in CDAP Standalone mode, if a MapReduce job fails
    repeatedly, then the SDK hits an out-of-memory exception due to
    PermGen space exhaustion. The Standalone needs restarting at this
    point.
  • For Microsoft Windows, the CDAP Standalone scripts can fail when used
    with a JAVA_HOME that is defined as a path with spaces in it. A
    workaround is to use a definition of JAVA_HOME that does not include
    spaces, such as C:\PROGRA~1\Java\jdk1.7.0_79\bin or
  • In the CDAP CLI, executing select * from a dataset with many
    fields generates an error.
  • A RESTful API call to retrieve workflow statistics hangs if units
    (such as "s" for seconds) are not provided as part of the query.
  • If a table schema contains a field name that is a reserved word in the
    Hive DDL, 'enable explore' fails.
  • During the upgrade to CDAP 3.4.1, publishing to Kafka is halted
    because the CDAP Kafka service is not running. As a consequence, any
    applications that sync to the CDAP metadata will become out-of-sync as
    changes to the metadata made by the upgrade tool will not be published.
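
The dataset-naming advice in the list above (do not start dataset names with
"system") can be checked before deployment. The following is a minimal sketch
under stated assumptions: `conflicts_with_system_tables` is a hypothetical
helper, not part of any CDAP API, and the case-insensitive comparison is a
conservative choice made here rather than documented CDAP behavior.

```python
# Prefix that CDAP reserves for its internal tables, per the known issue above.
RESERVED_PREFIX = "system"


def conflicts_with_system_tables(dataset_name):
    """Return True if the name starts with the reserved word "system".

    Hypothetical helper for illustration. The comparison ignores case as a
    conservative assumption; the release notes only mention "system".
    """
    return dataset_name.strip().lower().startswith(RESERVED_PREFIX)


# Example: filter a list of planned dataset names down to the risky ones.
planned = ["purchases", "system.queue", "systemMetrics", "mysystem"]
risky = [name for name in planned if conflicts_with_system_tables(name)]
```

Names collected in `risky` would be candidates for renaming before the
application is deployed.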