Granular Control Of result_format #4899
Replies: 2 comments 1 reply
-
Hey @greg-meinecke! Thanks for surfacing this. We'll review this functionality internally and get back to you soon!
-
Hey @greg-meinecke, thanks for requesting this feature. In our company and for our customers, we would appreciate a solution for this inside GX. @austiezr what is the current state here?

In addition to the above request, I would like to describe what we are currently facing. We have been successfully validating a wide range of objects with the pandas engine for more than 2 years. With the growing number of objects, we are also confronted with larger tables that cannot be split or filtered in a useful way. With those larger tables, we run into memory consumption issues in our service. Therefore, we are moving to the SQL engine, as all the data is in a PostgreSQL DB anyway.

We found (version 0.17.9) that partial_unexpected_count (default = 20) limits the unexpected_index_list and partial_unexpected_index_list to the configured count. But it does not constrain the unexpected_list, which is a problem on large tables with a high number of failures: all the failed values are stored in the result JSON, which can grow to GB size.

In our case, we don't need the unexpected_list. What we would like to have is fine-grained control over which lists are populated by GX. For us, that would be the unexpected_index_list (count < 500) and the partial_unexpected_index_list (count 20-100) for use in data-docs. Additionally, we would like to always have return_unexpected_index_query, as we have a custom action that collects the details later, on demand, based on the query. This is especially helpful when a larger number of details needs to be collected, because we can offload that work to the PostgreSQL DB engine.

I hope someone from the project can provide feedback on more granular control of the validation result.
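To make the request concrete, here is a minimal sketch of the trimming described above, applied after the fact to a plain result dict. The function name trim_result, its parameters, and the sample data are hypothetical and not part of GX; the field names are the ones mentioned in this thread. Note that this only shrinks what gets stored. It does not reduce what GX computes or holds in memory during validation, which is why built-in control would still be needed.

```python
from copy import deepcopy


def trim_result(result, max_index=500, max_partial=100, drop=("unexpected_list",)):
    """Return a trimmed copy of a result dict (hypothetical helper, not a GX API).

    Fields named in `drop` are removed; the two index lists are capped.
    """
    trimmed = deepcopy(result)

    # Remove the fields we never want to persist.
    for key in drop:
        trimmed.pop(key, None)

    # Cap the index lists at the requested sizes, if they are present.
    caps = {
        "unexpected_index_list": max_index,
        "partial_unexpected_index_list": max_partial,
    }
    for key, cap in caps.items():
        if isinstance(trimmed.get(key), list):
            trimmed[key] = trimmed[key][:cap]

    return trimmed


# Made-up sample shaped like the "result" part of a validation result.
sample = {
    "element_count": 1000,
    "unexpected_count": 600,
    "unexpected_list": list(range(600)),
    "unexpected_index_list": [{"id": i} for i in range(600)],
    "partial_unexpected_index_list": [{"id": i} for i in range(20)],
    "unexpected_index_query": "SELECT id FROM my_table WHERE value IS NULL;",
}

trimmed = trim_result(sample)
```

Every other key, including unexpected_index_query, passes through unchanged, so a custom action could still use the query to fetch details on demand.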
-
Currently the result_format parameter is limited to BOOLEAN_ONLY | BASIC | SUMMARY | COMPLETE, with BASIC being the default. If I am interested in my validation returning unexpected_index_list, I am forced to use result_format: COMPLETE and all of the fields that come with it (partial_unexpected_list, unexpected_list, partial_unexpected_index_list, etc.). This unnecessarily increases the size of the validation result JSON.

I would like to be able to choose the fields returned in the validation result, maybe something like result_format_field_list, where one can pass a list of the fields they would like returned in the result, in my case result_format_field_list: ["unexpected_index_list"].
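As a rough illustration of the intended behavior, the proposed option could act as an allow-list over the result fields. The sketch below works on a plain dict; select_result_fields and the sample values are hypothetical, and result_format_field_list is the option proposed here, not an existing GX parameter.

```python
def select_result_fields(result, field_list):
    """Keep only the result fields named in field_list (hypothetical helper)."""
    wanted = set(field_list)
    return {key: value for key, value in result.items() if key in wanted}


# Made-up sample shaped like a COMPLETE-style result.
complete_result = {
    "unexpected_count": 3,
    "partial_unexpected_list": ["a", "b", "c"],
    "unexpected_list": ["a", "b", "c"],
    "partial_unexpected_index_list": [{"id": 4}, {"id": 7}, {"id": 9}],
    "unexpected_index_list": [{"id": 4}, {"id": 7}, {"id": 9}],
}

result_format_field_list = ["unexpected_index_list"]
selected = select_result_fields(complete_result, result_format_field_list)
```

A real implementation would ideally skip computing the unselected fields in the first place rather than filtering them out afterwards.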