Update fetch_r_script_checksum to iterate over multiple results batches within a cursor #120
While using schemachange, we ran into an issue where fetch_r_script_checksum fails because the number of R-scripts in our pipeline exceeds what fits in a single Result Batch as returned by snowflake-connector-python. The function successfully iterates over all the rows in the first "chunk" but does not proceed into a second chunk when one exists. The suggested code update resolves this by first retrieving all result batches and then iterating through the rows within each batch. It requires pyarrow to be installed, so I've added it to setup.cfg. The specified version range matches the one required by snowflake-connector-python.
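To illustrate the idea, here is a minimal sketch of the batch-by-batch iteration, not the exact diff in this PR. The helper name `collect_r_script_checksums` is hypothetical, and it assumes the cursor has already executed a query whose rows are `(script_name, checksum)` pairs. `get_result_batches()` is the connector's cursor method for retrieving all result batches; it can return `None` when no query has been run.

```python
def collect_r_script_checksums(cursor):
    """Build a {script_name: checksum} dict from every result batch.

    Hypothetical helper: assumes `cursor` has already executed a query
    returning (script_name, checksum) rows.
    """
    checksums = {}
    # get_result_batches() may return None if nothing was executed.
    batches = cursor.get_result_batches() or []
    for batch in batches:
        # Each batch is iterable and yields its rows one at a time.
        for script_name, checksum in batch:
            checksums[script_name] = checksum
    return checksums
```

Because every batch is visited explicitly, rows beyond the first chunk are not skipped.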
Note that I did discover that the latest version of snowflake-connector-python (2.7.8) no longer fails with the existing fetch_r_script_checksum code as written, so raising that requirement in the config may also work. That said, I tested the suggested code update with snowflake-connector-python 2.6.2, 2.7.4, and 2.7.8, and it works on all of them, so it is the more flexible option and can be adopted without upgrading to 2.7.8 yet.