Feature/env variables #127

Draft
wants to merge 3 commits into
base: development
6 changes: 6 additions & 0 deletions README.md
@@ -42,6 +42,12 @@ You find an example in [`test/template-escape`](test/template-escape).
If you want to use, for example, `$(_name)` as both an external reference and a normal reference,
then add a `\` to the latter, resulting in `$(\_name)`.

If your YARRRML document contains sensitive information such as database credentials,
you can reference environment variables with a dollar sign and curly braces and define them in a `.env` file.
For example, use `${DB_PASSWORD}` in the YARRRML document and add a `.env` file with the content `DB_PASSWORD=mySecretPassword`.
Collaborator

Why can't we use normal brackets here as well, and check for every variable whether it is defined in the .env?

Author

So you mean $(DB_PASSWORD)?

This would clash with YARRRML templates: https://rml.io/yarrrml/spec/#template
Thus, if you (for some reason) specified value=myValue in your .env, the YARRRML template $(value) would be replaced as well.
Similarly, if you defined VALUE=myValue in .env, a YARRRML template $(VALUE) would be replaced.

Such a clash is, in my opinion, not desirable, because YARRRML templates refer to e.g. column names in a source.

Furthermore, curly braces (or no braces) are the common way to reference (environment) variables in UNIX shells; regular brackets are used for subshells: https://dev.to/rpalo/bash-brackets-quick-reference-4eh6
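
To illustrate the point about the two syntaxes not clashing, here is a minimal sketch. It is not the code from this PR; the sample line, the `env` object, and the pattern are assumptions made up for the example. A pattern that only accepts `${NAME}` leaves a `$(NAME)` template alone, even when both use the same name.

```javascript
// Hypothetical mapping fragment that uses the same name in both syntaxes.
const line = 'access: //${DB_HOST}/db, subject: ex:entity_$(DB_HOST)';

// Matches ${NAME} only; $(NAME) uses parentheses and is never matched.
const envVarPattern = /\$\{([^}]+)\}/g;

// Hypothetical environment, passed explicitly instead of reading process.env.
const env = { DB_HOST: 'myHost' };

// Replace known variables and keep unknown placeholders as they are.
const replaced = line.replace(envVarPattern, (placeholder, name) =>
  Object.prototype.hasOwnProperty.call(env, name) ? env[name] : placeholder
);

console.log(replaced);
```

Only the `${DB_HOST}` occurrence is substituted; the `$(DB_HOST)` template reaches the YARRRML parser unchanged.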

Collaborator

@pheyvaer pheyvaer Aug 11, 2021

Or do the same as with the external references, i.e., add a _ before the variable name, so you would then have $(_DB_PASSWORD)? External references could then come from both the CLI and .env, which makes sense.

Author

This would technically work, but it would be a convention that has to be used just because of YARRRML, and it would therefore harm reusability/interoperability in cases where YARRRML is not the only component.

There might be other components using .env variables, for example a script that creates an SSH tunnel to a remote DB host behind a firewall, some UI component, or any other application.

IMHO YARRRML should integrate into an existing setup as easily as possible, i.e., a setup in which certain environment variables are already in place and used by other systems. As a user, I would not want to change my whole application the moment I want to integrate YARRRML mappings. This could be a potential pitfall preventing people from using YARRRML.

As I see it, restricting environment variables to names with a prefixed underscore, just to combine them with external references (which can already be provided via parameter), would limit this feature.

Collaborator

You only have to change your YARRRML files, nothing else. The variable names remain the same. You only add the _ in the YARRRML template.
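
For comparison, the underscore-prefix alternative discussed in this thread could be sketched roughly as follows. Everything here is hypothetical: `resolveExternalReferences`, `cliRefs`, and the rule that CLI values win over environment values are assumptions for illustration, not the parser's actual API or behaviour.

```javascript
// Resolve $(_NAME) external references from CLI values first, then from the environment.
// `cliRefs` and `env` are plain objects supplied by the caller (both hypothetical).
function resolveExternalReferences(yarrrml, cliRefs, env) {
  const has = (object, key) => Object.prototype.hasOwnProperty.call(object, key);

  // Only $(_NAME) matches; plain $(NAME) templates and escaped $(\_NAME) do not.
  return yarrrml.replace(/\$\(_([^)]+)\)/g, (reference, name) => {
    if (has(cliRefs, name)) return cliRefs[name];
    if (has(env, name)) return env[name];
    return reference; // unknown references are left untouched
  });
}

// The variable keeps its name in .env; only the YARRRML side gains the underscore.
console.log(
  resolveExternalReferences('password: $(_DB_PASSWORD)', {}, { DB_PASSWORD: 's3cret' })
);
```

Under these assumptions the `.env` file would stay as it is, which is the point made in the comment above, while the mapping file uses YARRRML-specific syntax rather than `${...}`.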

During generation of an RML document, the variables will be replaced.
If a variable is not found in the current environment, the yarrrml-parser will emit a warning.
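
As a rough illustration of when the warning would fire, the following sketch lists the `${NAME}` placeholders of a document that have no value. The function name `findUndefinedVariables` and the explicit `env` argument are assumptions for this example; the actual check lives in the generator code of this PR.

```javascript
// List the ${NAME} placeholders in a YARRRML string that have no value in `env`.
function findUndefinedVariables(yarrrml, env) {
  const names = new Set(); // a Set reports each name once, however often it is used
  for (const match of yarrrml.matchAll(/\$\{([^}]+)\}/g)) {
    names.add(match[1]);
  }
  return [...names].filter((name) => env[name] === undefined);
}

// Hypothetical document and environment: only DB_HOST is defined.
const undefinedNames = findUndefinedVariables(
  'access: //${DB_HOST}:${DB_PORT}/${DB}',
  { DB_HOST: 'myHost' }
);
console.log(undefinedNames);
```

A non-empty result corresponds to the case in which the placeholders stay in the generated RML and a warning is emitted.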

If you want the outputted RML to be pretty, please provide the `-p` or `--pretty` parameter.

#### yarrrml-generator
3 changes: 3 additions & 0 deletions bin/parser.js
@@ -18,6 +18,9 @@ const watch = require('../lib/watcher.js');
const glob = require('glob');
const Logger = require('../lib/logger');

// load environment variables from .env into process.env
require('dotenv').config();

namespaces.ql = 'http://semweb.mmlab.be/ns/ql#';

pkginfo(module, 'version');
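
To make the effect of loading a `.env` file concrete, here is a simplified stand-in written for this example. It is not dotenv's real parser (which also handles quoting and other cases), and `loadEnvText` is a hypothetical name. The one behaviour it mirrors on purpose is dotenv's documented default of never overwriting a variable that is already set.

```javascript
// Simplified illustration of loading KEY=VALUE lines into an environment object.
// This is NOT dotenv; it ignores quoting, multiline values, and so on.
function loadEnvText(text, env) {
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === '' || line.startsWith('#')) continue; // skip blanks and comments

    const separator = line.indexOf('=');
    if (separator === -1) continue; // not a KEY=VALUE line

    const key = line.slice(0, separator).trim();
    const value = line.slice(separator + 1).trim();

    // Variables that already exist keep their value.
    if (!Object.prototype.hasOwnProperty.call(env, key)) {
      env[key] = value;
    }
  }
  return env;
}

// Hypothetical file content and a pre-existing shell variable.
const loaded = loadEnvText(
  '# credentials\nDB_PASSWORD=mySecretPassword\nDB_HOST=fromFile\n',
  { DB_HOST: 'fromShell' }
);
console.log(loaded);
```

With this precedence, a value exported in the shell wins over the same key in `.env`, so the file can act as a set of defaults.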
29 changes: 28 additions & 1 deletion lib/abstract-generator.js
@@ -48,8 +48,35 @@ class AbstractGenerator {

let json;

// replace found variables with the values from .env
let populatedYarrrml = yarrrml;
let reVars = /(\$\{.*?\})/g;
let usedEnvVariables = yarrrml.matchAll(reVars);
let nonFoundVars = [];
for(let match of usedEnvVariables) {
let found = match[1];
// extract var name, e.g. DB_HOST from ${DB_HOST} so we can compare with env variables
let varName = found.substring(2, (found.length - 1) );

if(varName in process.env) {
/* split-join solution because 'replaceAll' might not exist and 'replace' needs a global flag,
* which is problematic because then a regex has to be used instead of a string
* and then our ${} variables are not recognized
* see Stack Overflow: https://stackoverflow.com/a/542305
*/
populatedYarrrml = populatedYarrrml.split(found).join(process.env[varName]);

} else {
nonFoundVars.push(varName);
}
}

if(nonFoundVars.length > 0) {
Logger.warn(`No value set for the following used environment variables: ${nonFoundVars}`);
}

try {
- json = YAML.parse(yarrrml);
+ json = YAML.parse(populatedYarrrml);
} catch (e) {
e.code = 'INVALID_YAML';
e.file = file
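
An alternative to the match-then-split-join loop above, offered only as a sketch: `String.prototype.replace` with a global regex and a callback performs the lookup and the substitution in one pass. The name `populateEnvVariables` and the explicit `env` parameter are assumptions for this example. Collecting missing names in a `Set` also avoids listing a variable twice in the warning when it is used more than once.

```javascript
// Replace ${NAME} placeholders with values from `env` and collect the missing names.
function populateEnvVariables(yarrrml, env) {
  const missing = new Set();

  const populated = yarrrml.replace(/\$\{([^}]+)\}/g, (placeholder, name) => {
    if (Object.prototype.hasOwnProperty.call(env, name)) {
      // The return value of a callback is inserted literally, so values that
      // contain sequences such as "$&" are not interpreted as replacement patterns.
      return env[name];
    }
    missing.add(name);
    return placeholder; // keep unresolved placeholders in the document
  });

  return { populated, missing: [...missing] };
}

// Hypothetical input: DB is used twice but never defined.
const result = populateEnvVariables(
  '//${DB_HOST}:${DB_PORT}/${DB} ${DB}',
  { DB_HOST: 'myHost', DB_PORT: '$&5432' }
);
console.log(result.populated, result.missing);
```

In the generator, `env` would presumably be `process.env`, and a non-empty `missing` list would feed the existing `Logger.warn` call.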
42 changes: 42 additions & 0 deletions lib/rml-generator.test.js
@@ -75,6 +75,48 @@ describe('YARRRML to RML', function () {
work('template-escape/mapping.yml', 'template-escape/mapping.rml.ttl', done, {includeMetadata: false});
});



describe('environment variables', function() {
let envCache;

beforeEach(() => {
// cache current environment to reset after tests
envCache = Object.assign({}, process.env);
});

afterEach(() => {
// reset environment
process.env = envCache;
});

it('works for expanded environment variables', function(done) {
process.env.DB_USER = "dbUser";
process.env.DB_PASSWORD = "dbPassword";
process.env.DB_HOST = "myHost";
process.env.DB_PORT = "myPort";
process.env.DB = "myDB";
work('env-variables/env-replaced/mapping.yml', 'env-variables/env-replaced/mapping.rml.ttl', done, {includeMetadata: true});
});

it('has warning if no environment variable was replaced', () => {
const y2r = new convertYAMLtoRML();

y2r.convert(fs.readFileSync(path.resolve(__dirname, '../test/env-variables/env-not-defined-warning-all/mapping.yml'), 'utf8'));
assert.strictEqual(y2r.getLogger().has('warning'), true);
assert.strictEqual(y2r.getLogger().getAll().length, 1);
});

it('has warning if only a few environment variables were not replaced', () => {
process.env.DB_HOST = "myHost";
process.env.DB_PORT = "myPort";
const y2r = new convertYAMLtoRML();
y2r.convert(fs.readFileSync(path.resolve(__dirname, '../test/env-variables/env-not-defined-warning/mapping.yml'), 'utf8'));
assert.strictEqual(y2r.getLogger().has('warning'), true);
assert.strictEqual(y2r.getLogger().getAll().length, 1);
});
})

describe('between our worlds rules', function () {
it('anime', function (done) {
work('betweenourworlds/anime/mapping.yarrrml', 'betweenourworlds/anime/mapping.rml.ttl', done, {includeMetadata: false});
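
A note on the environment reset used in the new tests: assigning a plain object to `process.env` replaces Node's special environment object. An in-place restore is one possible alternative; the helper below is a hypothetical sketch (`restoreEnv` is not part of the test suite) and is demonstrated on a plain object so that it runs without touching the real environment.

```javascript
// Restore `env` in place to the state captured in `snapshot`.
function restoreEnv(env, snapshot) {
  // Remove keys that were added after the snapshot was taken.
  for (const key of Object.keys(env)) {
    if (!Object.prototype.hasOwnProperty.call(snapshot, key)) {
      delete env[key];
    }
  }
  // Put the original values back for keys that were changed or deleted.
  Object.assign(env, snapshot);
}

// Demonstration on a stand-in object instead of process.env.
const fakeEnv = { DB_HOST: 'original' };
const snapshot = { ...fakeEnv };

fakeEnv.DB_HOST = 'myHost'; // changed during a test
fakeEnv.DB_PORT = 'myPort'; // added during a test

restoreEnv(fakeEnv, snapshot);
console.log(fakeEnv);
```

In the hooks above this would amount to calling `restoreEnv(process.env, envCache)` inside `afterEach`, assuming `envCache` still holds the copy made in `beforeEach`.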
31 changes: 31 additions & 0 deletions package-lock.json

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions package.json
@@ -18,6 +18,7 @@
],
"license": "MIT",
"dependencies": {
"dotenv": "^10.0.0",
"@rdfjs/serializer-jsonld-ext": "^2.0.0",
"commander": "^8.3.0",
"extend": "^3.0.2",
51 changes: 51 additions & 0 deletions test/env-variables/env-not-defined-warning-all/mapping.rml.ttl
@@ -0,0 +1,51 @@
@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix fnml: <http://semweb.mmlab.be/ns/fnml#>.
@prefix fno: <https://w3id.org/function/ontology#>.
@prefix d2rq: <http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/0.1#>.
@prefix void: <http://rdfs.org/ns/void#>.
@prefix dc: <http://purl.org/dc/terms/>.
@prefix foaf: <http://xmlns.com/foaf/0.1/>.
@prefix rml: <http://semweb.mmlab.be/ns/rml#>.
@prefix ql: <http://semweb.mmlab.be/ns/ql#>.
@prefix : <http://mapping.example.com/>.
@prefix ex: <http://example.org/ns#>.

:rules_000 a void:Dataset;
void:exampleResource :map_myMapping_000.
:map_myMapping_000 rml:logicalSource :source_000.
:source_000 a rml:LogicalSource;
rml:source :database_000;
rml:query "SELECT val1, val2 FROM tab".
:database_000 a d2rq:Database;
d2rq:jdbcDSN "//${DB_HOST}:${DB_PORT}/${DB}";
d2rq:jdbcDriver "org.postgresql.Driver";
d2rq:username "${DB_USER}";
d2rq:password "${DB_PASSWORD}".
:source_000 rml:referenceFormulation ql:CSV.
:map_myMapping_000 a rr:TriplesMap;
rdfs:label "myMapping".
:s_000 a rr:SubjectMap.
:map_myMapping_000 rr:subjectMap :s_000.
:s_000 rr:template "http://example.org/ns#entity_{val1}".
:pom_000 a rr:PredicateObjectMap.
:map_myMapping_000 rr:predicateObjectMap :pom_000.
:pm_000 a rr:PredicateMap.
:pom_000 rr:predicateMap :pm_000.
:pm_000 rr:constant rdf:type.
:pom_000 rr:objectMap :om_000.
:om_000 a rr:ObjectMap;
rr:constant "http://example.org/ns#Thing";
rr:termType rr:IRI.
:pom_001 a rr:PredicateObjectMap.
:map_myMapping_000 rr:predicateObjectMap :pom_001.
:pm_001 a rr:PredicateMap.
:pom_001 rr:predicateMap :pm_001.
:pm_001 rr:constant ex:title.
:pom_001 rr:objectMap :om_001.
:om_001 a rr:ObjectMap;
rml:reference "val2";
rr:termType rr:Literal;
rml:languageMap :language_000.
:language_000 rr:constant "en".
22 changes: 22 additions & 0 deletions test/env-variables/env-not-defined-warning-all/mapping.yml
@@ -0,0 +1,22 @@
prefixes:
  ex: "http://example.org/ns#"

variables:
  credentials: &credentials
    username: ${DB_USER}
    password: ${DB_PASSWORD}

mappings:

  myMapping:
    sources:
      - access: //${DB_HOST}:${DB_PORT}/${DB}
        type: postgresql
        credentials: *credentials
        queryFormulation: sql2008
        referenceFormulation: csv
        query: SELECT val1, val2 FROM tab
    s: ex:entity_$(val1)
    po:
      - [a, ex:Thing]
      - [ex:title, $(val2), en~lang]
51 changes: 51 additions & 0 deletions test/env-variables/env-not-defined-warning/mapping.rml.ttl
@@ -0,0 +1,51 @@
@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix fnml: <http://semweb.mmlab.be/ns/fnml#>.
@prefix fno: <https://w3id.org/function/ontology#>.
@prefix d2rq: <http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/0.1#>.
@prefix void: <http://rdfs.org/ns/void#>.
@prefix dc: <http://purl.org/dc/terms/>.
@prefix foaf: <http://xmlns.com/foaf/0.1/>.
@prefix rml: <http://semweb.mmlab.be/ns/rml#>.
@prefix ql: <http://semweb.mmlab.be/ns/ql#>.
@prefix : <http://mapping.example.com/>.
@prefix ex: <http://example.org/ns#>.

:rules_000 a void:Dataset;
void:exampleResource :map_myMapping_000.
:map_myMapping_000 rml:logicalSource :source_000.
:source_000 a rml:LogicalSource;
rml:source :database_000;
rml:query "SELECT val1, val2 FROM tab".
:database_000 a d2rq:Database;
d2rq:jdbcDSN "//myHost:myPort/${DB}";
d2rq:jdbcDriver "org.postgresql.Driver";
d2rq:username "${DB_USER}";
d2rq:password "${DB_PASSWORD}".
:source_000 rml:referenceFormulation ql:CSV.
:map_myMapping_000 a rr:TriplesMap;
rdfs:label "myMapping".
:s_000 a rr:SubjectMap.
:map_myMapping_000 rr:subjectMap :s_000.
:s_000 rr:template "http://example.org/ns#entity_{val1}".
:pom_000 a rr:PredicateObjectMap.
:map_myMapping_000 rr:predicateObjectMap :pom_000.
:pm_000 a rr:PredicateMap.
:pom_000 rr:predicateMap :pm_000.
:pm_000 rr:constant rdf:type.
:pom_000 rr:objectMap :om_000.
:om_000 a rr:ObjectMap;
rr:constant "http://example.org/ns#Thing";
rr:termType rr:IRI.
:pom_001 a rr:PredicateObjectMap.
:map_myMapping_000 rr:predicateObjectMap :pom_001.
:pm_001 a rr:PredicateMap.
:pom_001 rr:predicateMap :pm_001.
:pm_001 rr:constant ex:title.
:pom_001 rr:objectMap :om_001.
:om_001 a rr:ObjectMap;
rml:reference "val2";
rr:termType rr:Literal;
rml:languageMap :language_000.
:language_000 rr:constant "en".
22 changes: 22 additions & 0 deletions test/env-variables/env-not-defined-warning/mapping.yml
@@ -0,0 +1,22 @@
prefixes:
  ex: "http://example.org/ns#"

variables:
  credentials: &credentials
    username: ${DB_USER}
    password: ${DB_PASSWORD}

mappings:

  myMapping:
    sources:
      - access: //${DB_HOST}:${DB_PORT}/${DB}
        type: postgresql
        credentials: *credentials
        queryFormulation: sql2008
        referenceFormulation: csv
        query: SELECT val1, val2 FROM tab
    s: ex:entity_$(val1)
    po:
      - [a, ex:Thing]
      - [ex:title, $(val2), en~lang]
51 changes: 51 additions & 0 deletions test/env-variables/env-replaced/mapping.rml.ttl
@@ -0,0 +1,51 @@
@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix fnml: <http://semweb.mmlab.be/ns/fnml#>.
@prefix fno: <https://w3id.org/function/ontology#>.
@prefix d2rq: <http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/0.1#>.
@prefix void: <http://rdfs.org/ns/void#>.
@prefix dc: <http://purl.org/dc/terms/>.
@prefix foaf: <http://xmlns.com/foaf/0.1/>.
@prefix rml: <http://semweb.mmlab.be/ns/rml#>.
@prefix ql: <http://semweb.mmlab.be/ns/ql#>.
@prefix : <http://mapping.example.com/>.
@prefix ex: <http://example.org/ns#>.

:rules_000 a void:Dataset;
void:exampleResource :map_myMapping_000.
:map_myMapping_000 rml:logicalSource :source_000.
:source_000 a rml:LogicalSource;
rml:source :database_000;
rml:query "SELECT val1, val2 FROM tab".
:database_000 a d2rq:Database;
d2rq:jdbcDSN "//myHost:myPort/myDB";
d2rq:jdbcDriver "org.postgresql.Driver";
d2rq:username "dbUser";
d2rq:password "dbPassword".
:source_000 rml:referenceFormulation ql:CSV.
:map_myMapping_000 a rr:TriplesMap;
rdfs:label "myMapping".
:s_000 a rr:SubjectMap.
:map_myMapping_000 rr:subjectMap :s_000.
:s_000 rr:template "http://example.org/ns#entity_{val1}".
:pom_000 a rr:PredicateObjectMap.
:map_myMapping_000 rr:predicateObjectMap :pom_000.
:pm_000 a rr:PredicateMap.
:pom_000 rr:predicateMap :pm_000.
:pm_000 rr:constant rdf:type.
:pom_000 rr:objectMap :om_000.
:om_000 a rr:ObjectMap;
rr:constant "http://example.org/ns#Thing";
rr:termType rr:IRI.
:pom_001 a rr:PredicateObjectMap.
:map_myMapping_000 rr:predicateObjectMap :pom_001.
:pm_001 a rr:PredicateMap.
:pom_001 rr:predicateMap :pm_001.
:pm_001 rr:constant ex:title.
:pom_001 rr:objectMap :om_001.
:om_001 a rr:ObjectMap;
rml:reference "val2";
rr:termType rr:Literal;
rml:languageMap :language_000.
:language_000 rr:constant "en".