feat: separate structure and profile in tables.yaml #337

Open
adrienaury opened this issue Dec 9, 2024 · 0 comments · May be fixed by #338
Labels
enhancement New feature or request

adrienaury commented Dec 9, 2024

Problem

The tables.yaml file mixes two types of information:

  1. information about the datasource structure (table names, primary keys, dbinfo types)
  2. information about extract or load operations (list of columns, export format, import format)

This example:

version: "1"
tables:
  - name: "film"
    keys: ["film_id"]
    columns:
      - name: "film_id"
        dbinfo:
          type: "bigserial"
      - name: "title"
        dbinfo:
          type: "varchar"
          length: 30
          bytes: true
      - name: "picture"
        export: "presence"
        import: "file"
        dbinfo:
          type: "BLOB"

contains information about the datasource structure (table names, primary keys, dbinfo types):

version: "1"
tables:
  - name: "film"
    keys: ["film_id"]
    columns:
      - name: "film_id"
        dbinfo:
          type: "bigserial"
      - name: "title"
        dbinfo:
          type: "varchar"
          length: 30
          bytes: true
      - name: "picture"
        dbinfo:
          type: "BLOB"

and information about extract or load operations (list of columns to export, export formats, import formats):

version: "1"
tables:
  - name: "film"
    columns:
      - name: "film_id"
      - name: "title"
      - name: "picture"
        export: "presence"
        import: "file"

These two types of information behave differently:

  1. information about the datasource structure never changes
  2. information about extract or load operations varies depending on the use case

Therefore, it would be useful to separate these concerns into different files.
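To make the separation concrete, here is a minimal Python sketch that splits an already-parsed tables.yaml mapping into a structure part and an operations part. It is only an illustration of the idea, not LINO code: the function name `split_tables` and the assumption that `export` and `import` are the only operation-specific column keys are taken from the example above, not from LINO's actual schema.

```python
import copy

# Assumption: only these column keys describe extract/load operations.
# Everything else (name, keys, dbinfo) is treated as datasource structure.
PROFILE_COLUMN_KEYS = {"export", "import"}


def split_tables(mixed):
    """Split a parsed tables.yaml mapping into (structure, profile) mappings."""
    structure = {"version": mixed["version"], "tables": []}
    profile = {"version": mixed["version"], "tables": []}
    for table in mixed["tables"]:
        structure_columns = []
        profile_columns = []
        for column in table.get("columns", []):
            # Structure keeps everything except the operation-specific keys.
            structure_columns.append(
                {k: copy.deepcopy(v) for k, v in column.items()
                 if k not in PROFILE_COLUMN_KEYS}
            )
            # Profile keeps the column name plus its export/import formats.
            profile_columns.append(
                {k: v for k, v in column.items()
                 if k == "name" or k in PROFILE_COLUMN_KEYS}
            )
        structure_table = {k: copy.deepcopy(v) for k, v in table.items()
                           if k != "columns"}
        structure_table["columns"] = structure_columns
        structure["tables"].append(structure_table)
        profile["tables"].append(
            {"name": table["name"], "columns": profile_columns}
        )
    return structure, profile
```

Applied to the "film" example, the first result carries `keys` and `dbinfo`, and the second carries only column names plus the `export` and `import` entries of the "picture" column.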

Solution

This does not impact existing configurations.

Information about extract or load operations should be managed by the existing ingress-descriptor configuration. This configuration is loaded by the pull and push commands via the existing flag: --ingress-descriptor <filename> or -i <filename>.

The ingress descriptor file already manages the list of columns to select. The only information missing to fully describe extract/load operations is the import/export formats.

When the --ingress-descriptor flag is used, import/export formats defined in the ingress-descriptor file override those loaded from the root tables.yaml file. Formats defined only in tables.yaml still apply, which keeps backward compatibility with the current behavior.
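The override rule can be sketched as follows. This is a hedged Python illustration of the proposed semantics, not the LINO implementation: `effective_formats` is a hypothetical helper, and accepting either a single column name or a list under `columns` is an assumption, since the proposal only shows a single name.

```python
FORMAT_KEYS = ("export", "import")


def effective_formats(table_columns, descriptor_formats=None):
    """Return {column name: formats} after applying descriptor overrides.

    table_columns: the "columns" list of one table from tables.yaml.
    descriptor_formats: the "formats" list from the ingress descriptor, or None.
    """
    # Start from the formats found in tables.yaml (current behavior).
    effective = {}
    for column in table_columns:
        effective[column["name"]] = {
            key: column[key] for key in FORMAT_KEYS if key in column
        }

    # Formats from the ingress descriptor take precedence when present.
    for entry in descriptor_formats or []:
        names = entry["columns"]
        if isinstance(names, str):  # assumption: a single name or a list
            names = [names]
        for name in names:
            target = effective.setdefault(name, {})
            for key in FORMAT_KEYS:
                if key in entry:
                    target[key] = entry[key]
    return effective


if __name__ == "__main__":
    columns = [{"name": "film_id"}, {"name": "title"}, {"name": "picture"}]
    formats = [{"columns": "picture", "export": "presence", "import": "file"}]
    print(effective_formats(columns, formats))
```

Without a descriptor, the function returns exactly what tables.yaml defines, which is the backward-compatible case.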

The previous example could be configured like this:

tables.yaml

version: "1"
tables:
  - name: "film"
    keys: ["film_id"]
    columns:
      - name: "film_id"
        dbinfo:
          type: "bigserial"
      - name: "title"
        dbinfo:
          type: "varchar"
          length: 30
          bytes: true
      - name: "picture"
        dbinfo:
          type: "BLOB"

ingress-descriptor.yaml

version: v1
IngressDescriptor:
    startTable: "film"
    select: ["film_id", "title", "picture"]
    formats:
      - columns: "picture"
        export: "presence"
        import: "file"

The following command would extract data using the list of columns to export and the export formats defined in ingress-descriptor.yaml:

$ lino pull source --ingress-descriptor ingress-descriptor.yaml

The following command would load data using the list of columns to import and the import formats defined in ingress-descriptor.yaml:

$ lino push source --ingress-descriptor ingress-descriptor.yaml
@adrienaury adrienaury self-assigned this Dec 9, 2024
@adrienaury adrienaury added the enhancement New feature or request label Dec 9, 2024
@adrienaury adrienaury linked a pull request Dec 9, 2024 that will close this issue