The dbt-spark-livy adapter allows you to use dbt along with Apache Spark, by connecting via Apache Livy

cloudera/dbt-spark-livy

 
 

dbt-spark-livy

The dbt-spark-livy adapter lets you use dbt with Apache Spark through Apache Livy, including on Cloudera Data Platform deployments with Livy server support. The code base builds on the dbt-spark project (https://github.com/dbt-labs/dbt-spark) and adds Livy connectivity on top of it.

Getting started

Running locally

A docker-compose environment starts a Spark Thrift server and a Postgres database as a Hive Metastore backend. Note: dbt-spark now supports Spark 3.1.1 (formerly on Spark 2.x).

Python >= 3.8

dbt-core ~= 1.3.0

pyspark

sqlparams

requests_kerberos

requests-toolbelt

python-decouple

Installing dbt-spark-livy

pip install dbt-spark-livy

Profile Setup

demo_project:
  target: dev
  outputs:
    dev:
      type: spark_livy
      method: livy
      schema: my_db
      host: https://spark-livy-gateway.my.org.com/dbt-spark/cdp-proxy-api/livy_for_spark3/
      user: my_user
      password: my_pass
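
A quick sanity check of a target like the one above can catch typos before dbt tries to open a Livy session. The sketch below is illustrative only: `check_livy_target` and `EXPECTED_KEYS` are hypothetical names, not part of the adapter, and treating every key from the sample profile as required is an assumption.

```python
from urllib.parse import urlparse

# Keys that appear in the sample profile. Treating all of them as
# required is an assumption made for this sketch.
EXPECTED_KEYS = ("type", "method", "schema", "host", "user", "password")


def check_livy_target(target: dict) -> list:
    """Return a list of human-readable problems found in one profile target."""
    problems = [f"missing key: {key}" for key in EXPECTED_KEYS if not target.get(key)]
    if target.get("type") not in (None, "spark_livy"):
        problems.append("type should be 'spark_livy'")
    if target.get("method") not in (None, "livy"):
        problems.append("method should be 'livy'")
    host = target.get("host")
    if host and urlparse(host).scheme not in ("http", "https"):
        problems.append("host should be a full http(s) URL")
    return problems


# Same values as the sample profile's "dev" output.
dev_target = {
    "type": "spark_livy",
    "method": "livy",
    "schema": "my_db",
    "host": "https://spark-livy-gateway.my.org.com/dbt-spark/cdp-proxy-api/livy_for_spark3/",
    "user": "my_user",
    "password": "my_pass",
}

# An empty list means the sketch found nothing to complain about.
print(check_livy_target(dev_target))
```

The check works on a plain dictionary, so it can be fed from whatever YAML loader is already in use.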

Caveats

  • When using Livy, if sessions in the Livy UI change state from starting to dead instead of idle, make sure the user has a proper mapping under Actions > Manage Access > IDBroker Mappings.
  • Also make sure the workload password is set, either through the UI or the CLI.
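
The session state mentioned in the caveats can also be read from the Livy REST API rather than the UI. The following is a minimal sketch, not the adapter's own logic: `classify_session_state` and `fetch_session_state` are hypothetical helpers, the grouping of states is a judgment call, and HTTP basic authentication with the profile's user and password is an assumption about how the gateway is configured.

```python
import base64
import json
import urllib.request

# Session states as listed in the Livy REST API documentation,
# grouped here for illustration.
READY_STATES = {"idle", "busy"}
PENDING_STATES = {"not_started", "starting"}
FAILED_STATES = {"dead", "error", "killed", "shutting_down"}


def classify_session_state(state: str) -> str:
    """Map a Livy session state to 'ready', 'pending', 'failed', or 'unknown'."""
    if state in READY_STATES:
        return "ready"
    if state in PENDING_STATES:
        return "pending"
    if state in FAILED_STATES:
        return "failed"
    return "unknown"


def fetch_session_state(livy_url: str, session_id: int, user: str, password: str) -> str:
    """Call GET <livy_url>/sessions/<session_id>/state and return the 'state' field.

    Basic authentication is an assumption; adjust to match your gateway.
    """
    url = f"{livy_url.rstrip('/')}/sessions/{session_id}/state"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["state"]


# A session that goes from "starting" straight to "dead" is the symptom
# described in the caveats.
for state in ("starting", "idle", "dead"):
    print(state, "->", classify_session_state(state))
```

If a session is classified as failed right after starting, the IDBroker mapping and workload password described above are the first things to check.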

Supported features

Please see the original adapter documentation: https://github.com/dbt-labs/dbt-spark and https://docs.getdbt.com/reference/warehouse-profiles/spark-profile
