Sub command: Custom
The mock data tool generates data based on the data type of each column; it is not smart enough to tell whether a column holds a name, an email address, etc. The custom subcommand hands control to the user, who decides how data is mocked into the tables, i.e.
- The user can pick which columns to skip and let the mock data tool decide the best data for them
- The user can control what kind of data goes into a column, i.e. the user can feed in a custom dataset (values are picked randomly from it during mocking)
- The user can select from the list of supported realistic data keys
NOTE: For the full list of realistic keys, check out the page
Under the custom subcommand, the user is given a YAML file containing a plan of how data will be loaded into each column. The file can then be modified and fed back to the tool to control the dataset to mock.
Short Hand: The shorthand for the custom subcommand is c
There are 3 ways to load data using the custom subcommand:
- User provided dataset
- Realistic dataset
- Random dataset
When two or more options are set for a column, the kind of data used to mock it is chosen in the order listed above, i.e. a user provided dataset is given preference over a realistic dataset, and so on.
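The order of preference described above can be sketched in Python. This is only an illustration of the rule, not the tool's actual code: the field names (`UserData`, `Realistic`, `Type`) come from the plan file shown later on this page, while `pick_source`, `mock_value` and the two generator arguments are hypothetical names introduced for this example.

```python
import random


def pick_source(column):
    """Return which dataset a column would draw from: user, realistic or random."""
    if column.get("UserData"):   # a non-empty user provided list wins
        return "user"
    if column.get("Realistic"):  # otherwise a non-empty realistic key is used
        return "realistic"
    return "random"              # otherwise fall back to data based on the type


def mock_value(column, realistic_generators, random_generator):
    """Produce one value for a column, following the order of preference.

    realistic_generators: hypothetical dict mapping a realistic key to a
        zero-argument function that returns a value.
    random_generator: hypothetical function that takes the column type and
        returns a random value of that type.
    """
    source = pick_source(column)
    if source == "user":
        return random.choice(column["UserData"])
    if source == "realistic":
        return realistic_generators[column["Realistic"]]()
    return random_generator(column["Type"])
```

For instance, a column with both `UserData: ["M", "F"]` and `Realistic: "NameFullName"` would be filled from the user provided values, because that dataset comes first in the order.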
The usage of the custom subcommand is
[gpadmin@gpdb-m ~]$ mock custom --help
Control the data being written to the tables
Usage:
mock custom [flags]
Aliases:
custom, c
Flags:
-f, --file string Mock the tables provided in the yaml file
-h, --help help for custom
-t, --table-name string Provide the table name whose skeleton need to be copied to the file
Global Flags:
-a, --address string Hostname where the postgres database lives
-d, --database string Database to mock the data (default "gpadmin")
-q, --dont-prompt Run without asking for confirmation
-i, --ignore Ignore checking and fixing constraints
-w, --password string Password for the user to connect to database
-p, --port int Port number of the postgres database (default 3000)
-r, --rows int Total rows to be faked or mocked (default 10)
-u, --username string Username to connect to the database
-v, --verbose Enable verbose or debug logging
As indicated above, you have a choice of three ways to control the data loaded into a table.
- Let's take an example of a table that has a check constraint (e.g. a partitioned table in Greenplum Database, or create your own Postgres table)
- Now let's build a plan for this table
mock custom --table-name sales -- OR -- mock c -t sales
- If the table is not in the default public schema, use
mock c -t <schema-name>.<table-name>
- If you want to generate a plan for multiple tables, use
mock c -t <schema-name1>.<table-name1>,<schema-name2>.<table-name2>...<schema-nameN>.<table-nameN>
- Once the plan is generated, the location of the YAML file is printed at the end
The YAML is saved to file: <PATH>/<FILENAME>
- Edit the file generated using any text editor of your choice
- On the column you want to control, add an array of values under the UserData key; the tool picks from them at random while mocking. For example, we take control of the date column below
Custom:
  - Schema: public
    Table: sales
    Column:
      - Name: id
        Type: integer
        UserData: []
        Realistic: ""
      - Name: date
        Type: date
        UserData:
          - 2016-01-01
          - 2016-03-01
          - 2016-04-01
        Realistic: ""
      - Name: amt
        Type: numeric(10,2)
        UserData: []
        Realistic: ""
- Continue this procedure for the rest of the columns you are interested in
- Using the custom generated plan, feed the yaml to the mock tool
mock custom --file <filename or path/filename> -- OR -- mock c -f <filename or path/filename>
- If you want more rows, use the rows flag
mock custom --file <filename or path/filename> --rows <total rows number> -- OR -- mock c -f <filename or path/filename> -r <total rows number>
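If you script these steps, the two command lines used above can be assembled as follows. This is a minimal sketch that only builds the argument lists and does not run anything. It uses only the flags shown in the help output (`--table-name`, `--file`, `--rows`); the function names and the `hr.employee` table are hypothetical and exist only for this example.

```python
import shlex


def plan_command(tables):
    """Build the command that generates a plan for one or more tables.

    Each entry is either "table" or "schema.table"; multiple tables are
    joined with commas, as described above.
    """
    if not tables:
        raise ValueError("at least one table name is required")
    return ["mock", "custom", "--table-name", ",".join(tables)]


def load_command(plan_file, rows=None):
    """Build the command that feeds an edited plan file back to the tool."""
    argv = ["mock", "custom", "--file", plan_file]
    if rows is not None:
        argv += ["--rows", str(rows)]
    return argv


if __name__ == "__main__":
    # Print the commands instead of running them.
    print(shlex.join(plan_command(["public.sales", "hr.employee"])))
    print(shlex.join(load_command("plan.yaml", rows=1000)))
```

Keeping the arguments as a list (rather than one string) means the result can be handed to `subprocess.run` without further quoting, should you choose to execute it.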
- Let's create a table, e.g.
CREATE TABLE employee ( name VARCHAR(100), email VARCHAR(120), mobile VARCHAR(50), gender VARCHAR(2), address VARCHAR(500) );
- Let's generate a plan for the table
mock custom --table-name employee -- OR -- mock c -t employee
- Edit the YAML generated by the above command to include realistic keys as shown below. For the complete list of realistic keys available, check out this part of the code available here
Custom:
  - Schema: public
    Table: employee
    Column:
      - Name: name
        Type: character varying(100)
        UserData: []
        Realistic: "NameFullName"
      - Name: email
        Type: character varying(120)
        UserData: []
        Realistic: "InternetEmail"
      - Name: mobile
        Type: character varying(50)
        UserData: []
        Realistic: "PhoneNumberString"
      - Name: gender
        Type: character varying(2)
        UserData: []
        Realistic: "NameGenderAbbrev"
      - Name: address
        Type: character varying(500)
        UserData: []
        Realistic: "AddressString"
- Using the custom generated plan, feed the yaml to the mock tool
mock custom --file <filename or path/filename> -- OR -- mock c -f <filename or path/filename>
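The edit made to the plan in the steps above can also be expressed programmatically. The sketch below works on the plan as a plain Python dictionary with the same structure as the YAML (`Custom`, `Schema`, `Table`, `Column`, `Name`, `Type`, `UserData`, `Realistic`); reading and writing the YAML file itself is left out, since that would need a YAML library. The function name `apply_realistic` is hypothetical.

```python
import copy


def apply_realistic(plan, table, keys_by_column):
    """Return a copy of the plan with Realistic set for the given columns.

    keys_by_column maps a column name to a realistic key; columns that are
    not listed are left untouched.
    """
    updated = copy.deepcopy(plan)  # do not modify the caller's plan
    for entry in updated["Custom"]:
        if entry["Table"] != table:
            continue
        for column in entry["Column"]:
            if column["Name"] in keys_by_column:
                column["Realistic"] = keys_by_column[column["Name"]]
    return updated


# The generated plan for part of the employee table, before editing.
plan = {
    "Custom": [
        {
            "Schema": "public",
            "Table": "employee",
            "Column": [
                {"Name": "name", "Type": "character varying(100)", "UserData": [], "Realistic": ""},
                {"Name": "email", "Type": "character varying(120)", "UserData": [], "Realistic": ""},
                {"Name": "mobile", "Type": "character varying(50)", "UserData": [], "Realistic": ""},
            ],
        }
    ]
}

edited = apply_realistic(
    plan,
    "employee",
    {"name": "NameFullName", "email": "InternetEmail"},
)
```

Here the mobile column keeps an empty `Realistic` value, so it would still be filled with random data based on its type.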
If you combine all three, i.e. randomly generated, user provided and realistic data, you have many possible ways of loading the data. Let's take an example
- Let us create a table
CREATE TABLE employee ( name VARCHAR(100), password_hash VARCHAR(30), gender VARCHAR );
- Let's generate a plan for the table
mock custom --table-name employee -- OR -- mock c -t employee
- Edit the YAML generated using the above command. Here
  - the name column will be fed realistic data
  - the password_hash column will be generated randomly by the tool
  - the gender column will be filled from a user provided dataset

  so our YAML now looks like
Custom:
  - Schema: public
    Table: employee
    Column:
      - Name: name
        Type: character varying(100)
        UserData: []
        Realistic: "NameFullName"
      - Name: password_hash
        Type: character varying(30)
        UserData: []
        Realistic: ""
      - Name: gender
        Type: character varying
        UserData: ["M", "F", "O"]
        Realistic: ""
- Using the custom generated plan, feed the yaml to the mock tool
mock custom --file <filename or path/filename> -- OR -- mock c -f <filename or path/filename>