Description
Problem
The tables.yaml file mixes two types of information:
- information about the datasource structure (table names, primary keys, dbinfo types)
- information about extract or load operations (list of columns, export format, import format)
This example:

    version: "1"
    tables:
      - name: "film"
        keys: ["film_id"]
        columns:
          - name: "film_id"
            dbinfo:
              type: "bigserial"
          - name: "title"
            dbinfo:
              type: "varchar"
              length: 30
              bytes: true
          - name: "picture"
            export: "presence"
            import: "file"
            dbinfo:
              type: "BLOB"

contains information about the datasource structure (table names, primary keys, dbinfo types):

    version: "1"
    tables:
      - name: "film"
        keys: ["film_id"]
        columns:
          - name: "film_id"
            dbinfo:
              type: "bigserial"
          - name: "title"
            dbinfo:
              type: "varchar"
              length: 30
              bytes: true
          - name: "picture"
            dbinfo:
              type: "BLOB"

and information about extract or load operations (list of columns to export, export formats, import formats):

    version: "1"
    tables:
      - name: "film"
        columns:
          - name: "film_id"
          - name: "title"
          - name: "picture"
            export: "presence"
            import: "file"

There is a difference between the two types of information:
- information about the datasource structure never changes
- information about extract or load operations varies depending on the use case
Therefore, it would be interesting to separate these concerns into different files.
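
To make the distinction concrete, here is a minimal Python sketch that splits a mixed configuration into its structural part and its operations part. It works on plain dictionaries shaped like the tables.yaml example above; the function name `split_tables` and the constant `OPERATION_KEYS` are hypothetical and are not part of LINO.

```python
import copy

# Assumption: "export" and "import" are the only operation-related column keys,
# as in the example above.
OPERATION_KEYS = ("export", "import")


def split_tables(mixed):
    """Split a mixed tables.yaml-like dict into (structure, operations)."""
    structure = {"version": mixed["version"], "tables": []}
    operations = {"version": mixed["version"], "tables": []}
    for table in mixed["tables"]:
        struct_cols, op_cols = [], []
        for col in table.get("columns", []):
            # Structure keeps everything except the operation keys.
            struct_cols.append(
                {k: copy.deepcopy(v) for k, v in col.items() if k not in OPERATION_KEYS}
            )
            # Operations keep the column name plus any export/import format.
            op_cols.append(
                {k: v for k, v in col.items() if k == "name" or k in OPERATION_KEYS}
            )
        # Table-level keys such as "name" and "keys" are structural.
        struct_table = {k: copy.deepcopy(v) for k, v in table.items() if k != "columns"}
        struct_table["columns"] = struct_cols
        structure["tables"].append(struct_table)
        operations["tables"].append({"name": table["name"], "columns": op_cols})
    return structure, operations


mixed = {
    "version": "1",
    "tables": [{
        "name": "film",
        "keys": ["film_id"],
        "columns": [
            {"name": "film_id", "dbinfo": {"type": "bigserial"}},
            {"name": "title", "dbinfo": {"type": "varchar", "length": 30, "bytes": True}},
            {"name": "picture", "export": "presence", "import": "file",
             "dbinfo": {"type": "BLOB"}},
        ],
    }],
}

structure, operations = split_tables(mixed)
```

Here `structure` holds what would stay in tables.yaml, and `operations` holds what would vary per use case.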
Solution
This does not impact existing configurations.
Information about extract or load operations should be managed by the existing ingress-descriptor configuration. This configuration is loaded by the pull and push commands via the existing flag: --ingress-descriptor <filename> or -i <filename>.
The ingress descriptor file already manages the list of columns to select. The only information missing to complete extract/load operations is the import/export formats.
When the --ingress-descriptor flag is used, the import/export formats contained in the ingress-descriptor file override the information loaded from the root tables.yaml file. Formats that are still defined in tables.yaml continue to apply when the ingress descriptor does not redefine them, which keeps backward compatibility with the current behavior.
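
The override rule can be sketched in Python as follows. This is only an illustration under stated assumptions, not LINO's implementation: the function name `effective_formats` is hypothetical, each descriptor entry is assumed to name a single column under `columns`, and the merge is assumed to be key by key (an `export` override leaves an existing `import` untouched). The value `"placeholder-format"` is a made-up stand-in, not a real LINO format.

```python
def effective_formats(table, descriptor_formats=None):
    """Return {column name: {"export": ..., "import": ...}} for one table.

    Formats found in the tables.yaml entry are the baseline; entries from
    the ingress descriptor, when given, take precedence key by key.
    """
    result = {}
    # Baseline: formats still declared in tables.yaml.
    for col in table.get("columns", []):
        fmt = {k: col[k] for k in ("export", "import") if k in col}
        if fmt:
            result[col["name"]] = fmt
    # Overrides: formats declared in the ingress descriptor.
    for entry in descriptor_formats or []:
        override = {k: entry[k] for k in ("export", "import") if k in entry}
        result.setdefault(entry["columns"], {}).update(override)
    return result


# A tables.yaml entry that still carries formats (legacy configuration).
legacy_table = {
    "name": "film",
    "columns": [
        {"name": "film_id"},
        {"name": "picture", "export": "placeholder-format", "import": "file"},
    ],
}

# Without a descriptor, the tables.yaml formats apply unchanged.
without_descriptor = effective_formats(legacy_table)

# With a descriptor, its formats win over tables.yaml.
with_descriptor = effective_formats(
    legacy_table, [{"columns": "picture", "export": "presence"}]
)
```

With these inputs, `without_descriptor` keeps the formats from tables.yaml, while `with_descriptor` takes `export` from the descriptor and `import` from tables.yaml.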
The previous example could be configured like this:
tables.yaml

    version: "1"
    tables:
      - name: "film"
        keys: ["film_id"]
        columns:
          - name: "film_id"
            dbinfo:
              type: "bigserial"
          - name: "title"
            dbinfo:
              type: "varchar"
              length: 30
              bytes: true
          - name: "picture"
            dbinfo:
              type: "BLOB"

ingress-descriptor.yaml

    version: v1
    IngressDescriptor:
      startTable: "film"
      select: ["film_id", "title", "picture"]
      formats:
        - columns: "picture"
          export: "presence"
          import: "file"

The following command would extract data using the list of columns to export and the export formats defined in ingress-descriptor.yaml:

    $ lino pull source --ingress-descriptor ingress-descriptor.yaml

The following command would load data using the list of columns to import and the import formats defined in ingress-descriptor.yaml:

    $ lino push source --ingress-descriptor ingress-descriptor.yaml