#2172 - update hermes json files #2173
Changes from 7 commits
First changed file (the coalesceConformanceRule test definition):

@@ -2,9 +2,9 @@
 {
   "vars": {
     "spark-submit": "spark-submit --num-executors 2 --executor-memory 2G --deploy-mode client",
-    "spark-conf": "--conf 'spark.driver.extraJavaOptions=-Denceladus.rest.uri=http://localhost:8080/rest_api/api -Dspline.mongodb.name=spline -Dspline.mongodb.url=mongodb://127.0.0.1:27017/ -Denceladus.recordId.generation.strategy=stableHashId'",
-    "enceladus-job-jar": "spark-jobs/target/spark-jobs-2.19.0-SNAPSHOT.jar",
-    "credentials": "--rest-api-credentials-file ~/.ssh/menasCredential.properties",
+    "spark-conf": "--conf 'spark.driver.extraJavaOptions=-Denceladus.rest.uri=http://localhost:8080/rest_api/api -Denceladus.recordId.generation.strategy=stableHashId'",
+    "enceladus-job-jar": "spark-jobs/target/spark-jobs-3.0.0-SNAPSHOT.jar",
+    "credentials": "--rest-api-credentials-file ~/.ssh/menas-credential.properties",
     "ref-std-data-path": "/ref/coalesceConformanceRule/std",
     "new-std-data-path": "/tmp/conformance-output/standardized-coalesceConformanceRule-1-2020-03-23-1",
     "ref-publish-data-path": "/ref/coalesceConformanceRule/publish",

Review comment on the updated enceladus-job-jar line:
Comment: Wonder if this somehow could be integrated with the current version. 🤔
Reply: Added as point for improvement in Issue #2168
Second changed file (the std_nt_dn test definition):

@@ -1,10 +1,10 @@
 {
   "vars": {
-    "spark-submit": "spark-submit --num-executors 2 --executor-memory 2G --deploy-mode client",
+    "spark-submit": "spark-submit --num-executors 2 --executor-memory 2G --deploy-mode client --conf spark.sql.parquet.datetimeRebaseModeInRead=LEGACY --conf spark.sql.parquet.datetimeRebaseModeInWrite=LEGACY",
     "spark-conf": "--conf 'spark.driver.extraJavaOptions=-Denceladus.rest.uri=http://localhost:8080/rest_api/api -Dspline.mongodb.name=spline -Dspline.mongodb.url=mongodb://127.0.0.1:27017/ -Denceladus.recordId.generation.strategy=stableHashId'",
-    "enceladus-job-jar": "spark-jobs/target/spark-jobs-2.19.0-SNAPSHOT.jar",
-    "credentials": "--rest-api-credentials-file ~/.ssh/menasCredential.properties",
+    "enceladus-job-jar": "spark-jobs/target/spark-jobs-3.0.0-SNAPSHOT.jar",
+    "credentials": "--rest-api-credentials-file ~/.ssh/menas-credential.properties",
     "ref-std-data-path": "/ref/std_nt_dn/std",
     "new-std-data-path": "/tmp/conformance-output/standardized-std_nt_dn-1-2019-11-27-1",
     "results-log-path": "/std/std_nt_dn/results",
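The two --conf pairs added to the "spark-submit" variable control how Spark 3 rebases parquet date/timestamp values across the Julian-to-Proleptic-Gregorian calendar switch; LEGACY keeps the old-style rebasing so the standardized output stays comparable with reference data written by earlier Spark versions. The sketch below applies the same two settings programmatically; it is only an illustration, not code from this repository (the object name, local master and /tmp output path are made up).

import org.apache.spark.sql.SparkSession

object RebaseModeDemo {
  def main(args: Array[String]): Unit = {
    // Same two settings as the updated "spark-submit" variable above.
    val spark = SparkSession.builder()
      .appName("rebase-mode-demo")
      .master("local[*]")
      .config("spark.sql.parquet.datetimeRebaseModeInWrite", "LEGACY")
      .config("spark.sql.parquet.datetimeRebaseModeInRead", "LEGACY")
      .getOrCreate()

    import spark.implicits._

    // Dates before 1582-10-15 are the ones affected by the calendar switch;
    // with LEGACY they are rebased the old way instead of raising an error.
    val df = Seq(java.sql.Date.valueOf("1001-01-01")).toDF("old_date")
    df.write.mode("overwrite").parquet("/tmp/rebase-demo") // illustrative path
    spark.read.parquet("/tmp/rebase-demo").show()
  }
}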
@@ -15,7 +15,7 @@
     "pluginName" : "BashPlugin",
     "name": "Standardization",
     "order" : 0,
-    "args" : ["#{spark-submit}# #{spark-conf}# --class za.co.absa.enceladus.standardization.StandardizationJob #{enceladus-job-jar}# #{credentials}# #{dataset}# --raw-format csv --header true "],
+    "args" : ["#{spark-submit}# #{spark-conf}# --class za.co.absa.enceladus.standardization.StandardizationJob #{enceladus-job-jar}# #{credentials}# #{dataset}# --raw-format json"],
     "writeArgs": []
   },
   {
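The #{...}# tokens in "args" appear to be placeholders that are resolved against the "vars" block before the step runs, so the Standardization command line above is assembled from the spark-submit, spark-conf, enceladus-job-jar, credentials and dataset variables. A rough sketch of that substitution idea, assuming plain string replacement (the resolve helper and object name are hypothetical, not Hermes's actual implementation):

object VarSubstitutionSketch {
  // Values taken from the updated "vars" block above; #{spark-conf}# and
  // #{dataset}# would be resolved the same way from their own entries.
  val vars: Map[String, String] = Map(
    "spark-submit" -> "spark-submit --num-executors 2 --executor-memory 2G --deploy-mode client --conf spark.sql.parquet.datetimeRebaseModeInRead=LEGACY --conf spark.sql.parquet.datetimeRebaseModeInWrite=LEGACY",
    "enceladus-job-jar" -> "spark-jobs/target/spark-jobs-3.0.0-SNAPSHOT.jar",
    "credentials" -> "--rest-api-credentials-file ~/.ssh/menas-credential.properties"
  )

  // Replace every #{name}# placeholder with its value from vars.
  def resolve(template: String): String =
    vars.foldLeft(template) { case (acc, (key, value)) => acc.replace(s"#{$key}#", value) }

  def main(args: Array[String]): Unit =
    println(resolve("#{spark-submit}# --class za.co.absa.enceladus.standardization.StandardizationJob #{enceladus-job-jar}# #{credentials}# --raw-format json"))
}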
@@ -30,7 +30,7 @@
     "pluginName" : "DatasetComparison",
     "name": "DatasetComparison",
     "order" : 1,
-    "args" : ["--format", "parquet", "--new-path", "#{new-std-data-path}#", "--ref-path", "#{ref-std-data-path}#", "--keys", "name" ],
+    "args" : ["--format", "parquet", "--new-path", "#{new-std-data-path}#", "--ref-path", "#{ref-std-data-path}#", "--keys", "property", "--datetimeRebaseMode", "LEGACY"],
     "writeArgs": ["--out-path", "#{results-log-path}#/stdDataDiff"],
     "dependsOn": "Standardization"
   }

Review conversation on the added "--datetimeRebaseMode", "LEGACY" arguments:
Comment (Suggested change): I think we said this is not needed anymore.
Reply: We spoke about it, but my experiments confirmed that for four std JSON files this setting needs to be there.
Comment: So we updated Hermes to include […] We removed the option afaik - AbsaOSS/hermes#132
Reply: You are right, I was confused.
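The change from "--keys", "name" to "--keys", "property" tells the DatasetComparison step which column serves as the record key when the newly standardized output is compared against the reference data. Conceptually that comparison boils down to a keyed diff of two parquet datasets, roughly as sketched below (the real Hermes DatasetComparison plugin does considerably more; the object name is made up and the paths are the two *-std-data-path variables from this file):

import org.apache.spark.sql.SparkSession

object KeyedComparisonSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("keyed-comparison-sketch")
      .master("local[*]")
      .getOrCreate()

    // #{new-std-data-path}# and #{ref-std-data-path}# from the "vars" block.
    val newDf = spark.read.parquet("/tmp/conformance-output/standardized-std_nt_dn-1-2019-11-27-1")
    val refDf = spark.read.parquet("/ref/std_nt_dn/std")

    // Records whose key column "property" appears on only one side.
    val onlyInNew = newDf.join(refDf, Seq("property"), "left_anti")
    val onlyInRef = refDf.join(newDf, Seq("property"), "left_anti")

    println(s"keys only in new: ${onlyInNew.count()}, keys only in ref: ${onlyInRef.count()}")
  }
}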
Review comment on the credentials file rename (menasCredential.properties → menas-credential.properties):
Comment: Why this change? Seems like personal setup.
Reply: I would see it as a small improvement towards a consistent pattern where CamelCase is used for class files only. I am used to writing these non-class files as "word-word" or "word_word", and I can see more locations with similar naming. Are we following some format rules on when to use CamelCase, w-w and w_w?