Running CoRB
There are a variety of ways to execute a CoRB job.
The entry point is the main method in the com.marklogic.developer.corb.Manager class. CoRB requires the MarkLogic XCC JAR on the classpath (preferably the version that corresponds to the MarkLogic server version), which can be downloaded from https://developer.marklogic.com/products/xcc.
ml-gradle has a CorbTask that facilitates executing CoRB as a Gradle task. Refer to https://github.com/marklogic-community/ml-gradle/wiki/Corb-and-Gradle for an overview and specific instructions for configuring and executing CoRB as a Gradle task.
CoRB needs options specified through one or more of the following mechanisms:
- command-line parameters
- Java system properties, e.g. -DXCC-CONNECTION-URI=xcc://user:password@localhost:8202
- a properties file on the classpath, specified using -DOPTIONS-FILE=myjob.properties. Relative and full file system paths are also supported.
If an option is specified in more than one place, a command-line parameter takes precedence over a Java system property, which takes precedence over a property from the OPTIONS-FILE properties file.
Note: Any or all of the properties can be specified as Java system properties or as key-value pairs in a properties file.
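The precedence order described above can be illustrated with a small shell sketch. The `resolve_option` function is a hypothetical helper, not part of CoRB; it treats an empty string as "not specified", which is an assumption made only for this illustration.

```shell
# Hypothetical helper (not part of CoRB): choose the effective value of one option
# using the documented order: command line > Java system property > OPTIONS-FILE.
# An empty argument stands for "not specified" (an assumption of this sketch).
resolve_option() {
  cli_value=$1       # value given as a command-line parameter
  sysprop_value=$2   # value given as -DNAME=value
  file_value=$3      # value read from the OPTIONS-FILE properties file
  if [ -n "$cli_value" ]; then
    printf '%s\n' "$cli_value"
  elif [ -n "$sysprop_value" ]; then
    printf '%s\n' "$sysprop_value"
  else
    printf '%s\n' "$file_value"
  fi
}

# THREAD-COUNT is 4 in the properties file and 10 as a system property,
# with nothing on the command line, so the system property value is used.
resolve_option "" "10" "4"
```

With these inputs the sketch prints 10, mirroring a run where -DTHREAD-COUNT=10 overrides THREAD-COUNT=4 from the properties file.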
Note: CoRB exit codes: 0 - successful, 0 - nothing to process (ref: EXIT-CODE-NO-URIS), 1 - initialization or connection error, and 2 - execution error.
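A wrapper script can branch on the exit codes listed above. The following is a minimal sketch; `describe_corb_exit` is a hypothetical function name, and the mapping for 0 assumes the no-URIs exit code has been left at the value shown above.

```shell
# Hypothetical helper: translate a CoRB exit code into the meaning documented above.
describe_corb_exit() {
  case "$1" in
    0) echo "successful, or nothing to process" ;;
    1) echo "initialization or connection error" ;;
    2) echo "execution error" ;;
    *) echo "unexpected exit code $1" ;;
  esac
}

# Typical use right after a CoRB run (the java command is elided here):
#   java ... com.marklogic.developer.corb.Manager; describe_corb_exit "$?"
describe_corb_exit 2
```

Because "successful" and "nothing to process" share code 0 here, a script that needs to tell them apart would have to rely on the EXIT-CODE-NO-URIS option referenced above.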
Note: CoRB now supports logging job metrics back to the MarkLogic database log and/or as a document in the database.
java -server -cp .:marklogic-xcc-8.0.5.jar:marklogic-corb-2.4.5.jar
com.marklogic.developer.corb.Manager
XCC-CONNECTION-URI
[COLLECTION-NAME] [PROCESS-MODULE] [THREAD-COUNT] [URIS-MODULE] [MODULE-ROOT]
[MODULES-DATABASE] [INSTALL] [PROCESS-TASK] [PRE-BATCH-MODULE] [PRE-BATCH-TASK]
[POST-BATCH-MODULE] [POST-BATCH-TASK] [EXPORT-FILE-DIR] [EXPORT-FILE-NAME]
[URIS-FILE]
java -server -cp .:marklogic-xcc-10.0.2.jar:marklogic-corb-2.4.6.jar
-DXCC-CONNECTION-URI=xcc://user:password@host:port/[ database ]
-DPROCESS-MODULE=module-name.xqy -DTHREAD-COUNT=10
-DURIS-MODULE=get-uris.xqy
-DPOST-BATCH-MODULE=post-batch.xqy
-D...
com.marklogic.developer.corb.Manager
java -server -cp .:marklogic-xcc-10.0.2.jar:marklogic-corb-2.4.6.jar
-DOPTIONS-FILE=job.properties com.marklogic.developer.corb.Manager
This looks for the job.properties file in the classpath.
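When the same invocation is reused across jobs, it may help to assemble the command from its parts. The sketch below only prints the command rather than running it; `corb_command` is a hypothetical helper, and the JAR names are the ones used in the example above.

```shell
# Hypothetical helper: print (not run) the java command for a properties-file job.
corb_command() {
  xcc_jar=$1        # MarkLogic XCC JAR
  corb_jar=$2       # CoRB JAR
  options_file=$3   # properties file, resolved from the classpath or a file path
  printf 'java -server -cp .:%s:%s -DOPTIONS-FILE=%s com.marklogic.developer.corb.Manager\n' \
    "$xcc_jar" "$corb_jar" "$options_file"
}

corb_command marklogic-xcc-10.0.2.jar marklogic-corb-2.4.6.jar job.properties
```

The classpath starts with `.` so that a properties file in the current directory can be found on the classpath.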
java -server -cp .:marklogic-xcc-10.0.2.jar:marklogic-corb-2.4.6.jar
-DOPTIONS-FILE=job.properties -DTHREAD-COUNT=10
com.marklogic.developer.corb.Manager XCC-CONNECTION-URI
Note: any of the properties below can be specified as a Java system property (i.e. with the '-D' option).
XCC-CONNECTION-URI=xcc://user:password@localhost:8202/
THREAD-COUNT=10
MODULE-ROOT=/temp/
MODULES-DATABASE=MY-Modules-DB
URIS-MODULE=get-uris.xqy
PROCESS-MODULE=transform.xqy
XCC-USERNAME=username
XCC-PASSWORD=password
XCC-HOSTNAME=localhost
XCC-PORT=9999
XCC-DBNAME=ML-database
THREAD-COUNT=10
MODULE-ROOT=/temp/
MODULES-DATABASE=MY-Modules-DB
URIS-MODULE=get-uris.xqy
PROCESS-MODULE=SampleCorbJob.xqy
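The individual XCC-USERNAME, XCC-PASSWORD, XCC-HOSTNAME, XCC-PORT and XCC-DBNAME properties carry the same pieces that appear in the xcc://user:password@host:port/database URI form. As a rough illustration of that correspondence, the hypothetical `xcc_uri` helper below joins the parts; it does not URL-encode special characters, so it is only a sketch.

```shell
# Hypothetical helper: join connection parts into the xcc:// URI form.
# No URL-encoding is applied, so special characters in the password are not handled.
xcc_uri() {
  # $1 = user, $2 = password, $3 = host, $4 = port, $5 = database (optional)
  printf 'xcc://%s:%s@%s:%s/%s\n' "$1" "$2" "$3" "$4" "${5:-}"
}

xcc_uri username password localhost 9999 ML-database
# prints xcc://username:password@localhost:9999/ML-database
```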
XCC-CONNECTION-URI=xcc://user:password@localhost:8202/
THREAD-COUNT=10
MODULE-ROOT=/temp/
MODULES-DATABASE=MY-Modules-DB
URIS-FILE=input-uris.csv
PROCESS-MODULE=SampleCorbJob.xqy
XCC-CONNECTION-URI=xcc://user:password@localhost:8202/
THREAD-COUNT=10
MODULE-ROOT=/temp/
MODULES-DATABASE=MY-Modules-DB
XML-FILE=input.xml
XML-NODE=/rootNode/childNode
URIS-LOADER=com.marklogic.developer.corb.FileUrisXMLLoader
PROCESS-MODULE=SampleCorbJob.xqy
XCC-CONNECTION-URI=xcc://user:password@localhost:8202/
THREAD-COUNT=10
MODULE-ROOT=/temp/
MODULES-DATABASE=MY-Modules-DB
PROCESS-MODULE=get-data-from-document.xqy
PROCESS-TASK=com.marklogic.developer.corb.ExportBatchToFileTask
EXPORT-FILE-NAME=/local/path/to/exportmyfile.csv
PRE-BATCH-TASK=com.marklogic.developer.corb.PreBatchUpdateFileTask
EXPORT-FILE-TOP-CONTENT=col1,col2,col3
sample 7 - dynamic headers, assuming pre-batch-header.xqy module returns the header row, add the following to sample 4.
PRE-BATCH-MODULE=pre-batch-header.xqy
PRE-BATCH-TASK=com.marklogic.developer.corb.PreBatchUpdateFileTask
XCC-CONNECTION-URI=xcc://user:password@localhost:8202/
THREAD-COUNT=10
MODULE-ROOT=/temp/
MODULES-DATABASE=MY-Modules-DB
URIS-MODULE=get-uris.xqy
PROCESS-MODULE=transform.xqy
PRE-BATCH-MODULE=pre-batch.xqy
POST-BATCH-MODULE=post-batch.xqy
The XQuery modules live on the local filesystem where CoRB is located. Any XQuery module can be run ad hoc.
XCC-CONNECTION-URI=xcc://user:password@localhost:8202/
THREAD-COUNT=10
MODULE-ROOT=/temp/
MODULES-DATABASE=MY-Modules-DB
URIS-MODULE=get-uris.xqy|ADHOC
PROCESS-MODULE=SampleCorbJob.xqy|ADHOC
PRE-BATCH-MODULE=/local/path/to/adhoc-pre-batch.xqy|ADHOC
The XCC-CONNECTION-URI, XCC-USERNAME, XCC-PASSWORD, XCC-HOSTNAME, XCC-PORT and/or XCC-DBNAME properties can be encrypted and optionally enclosed in ENC(). If JASYPT-PROPERTIES-FILE is not specified, CoRB assumes the default jasypt.properties.
XCC-CONNECTION-URI=ENC(encrypted_uri)
DECRYPTER=com.marklogic.developer.corb.JasyptDecrypter
sample jasypt.properties
jasypt.password=foo
jasypt.algorithm=PBEWithMD5AndTripleDES
XCC-CONNECTION-URI, XCC-USERNAME, XCC-PASSWORD, XCC-HOSTNAME, XCC-PORT and/or XCC-DBNAME properties can be encrypted and optionally enclosed by ENC()
XCC-CONNECTION-URI=encrypted_uri
DECRYPTER=com.marklogic.developer.corb.PrivateKeyDecrypter
PRIVATE-KEY-FILE=/path/to/key/private.key
PRIVATE-KEY-ALGORITHM=RSA
XCC-CONNECTION-URI, XCC-USERNAME, XCC-PASSWORD, XCC-HOSTNAME, XCC-PORT and/or XCC-DBNAME properties can be encrypted and optionally enclosed by ENC()
XCC-CONNECTION-URI=encrypted_uri
DECRYPTER=com.marklogic.developer.corb.PrivateKeyDecrypter
PRIVATE-KEY-FILE=/path/to/rsa/key/private.pkcs8.key
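One possible way to produce an RSA key pair and an encrypted value for the private-key samples above is with the openssl command line. This is a sketch under assumptions: the file names are placeholders, `encrypt_for_corb` is a hypothetical helper, and Base64 is assumed to be the encoding that PrivateKeyDecrypter accepts for the ciphertext.

```shell
# Hypothetical helper: RSA-encrypt a value with a public key and Base64-encode it.
# Base64 output is an assumption about what PrivateKeyDecrypter expects.
encrypt_for_corb() {
  # $1 = plaintext (e.g. an XCC URI), $2 = RSA public key file in PEM format
  printf '%s' "$1" | openssl pkeyutl -encrypt -pubin -inkey "$2" | openssl base64 -A
}

if command -v openssl >/dev/null 2>&1; then
  workdir=$(mktemp -d)
  # 1. Generate an RSA private key in PEM format.
  openssl genrsa -out "$workdir/private.pem" 2048 2>/dev/null
  # 2. Convert it to unencrypted PKCS#8, the file PRIVATE-KEY-FILE would point to.
  openssl pkcs8 -topk8 -nocrypt -in "$workdir/private.pem" -out "$workdir/private.pkcs8.key"
  # 3. Extract the matching public key, used only for encrypting values.
  openssl rsa -in "$workdir/private.pem" -pubout -out "$workdir/public.key" 2>/dev/null
  # 4. Encrypt the connection URI and print it as a property line.
  encrypted=$(encrypt_for_corb 'xcc://user:password@localhost:8202/' "$workdir/public.key")
  echo "XCC-CONNECTION-URI=$encrypted"
else
  echo "openssl not found; skipping key generation" >&2
fi
```

The private key stays with the job configuration while the public key can be handed to whoever needs to encrypt credentials, which is the main reason to prefer this over a shared password.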
MODULE-ROOT=/temp/
MODULES-DATABASE=MY-Modules-DB
URIS-MODULE=get-uris.sjs
PROCESS-MODULE=transform.sjs
URIS-MODULE=get-uris.sjs|ADHOC
PROCESS-MODULE=extract.sjs|ADHOC