Performance Improvement: removed DB Schema validation from `update_if_valid()` #361

nh916 · 2023-09-20T22:38:15Z

Description

removed DB Schema validation from update_if_valid() to improve performance

Issue Link

https://trello.com/c/1wBnrbBw

Changes

removed DB Schema validation within update_if_valid method
- originally the tests took 54 minutes to run; after the change they took 13 minutes
changed the docstrings and code to reflect the new updates

Screenshots

click to see test performance before

click to see test performance after

Notes

was thinking of knocking out the DB schema checking through deserialization and serialization, but I don't think I have to because the performance is good where it is
I think this is good, but if we decide we need more or less validation later as we go, we can add it in

Checklist

My name is on the list of contributors (CONTRIBUTORS.md) in the pull request source branch.
I have updated the documentation to reflect my changes.


trunk-io · 2023-09-20T22:38:17Z

Merging to develop in this repository is managed by Trunk.

To merge this pull request, check the box to the left or comment /trunk merge below.

InnocentBug

Absolutely not a fan of this.
I won't approve.

Human time spent working with the SDK is always more valuable than computer time.
Debugging any script with the SDK without this check is an absolute nightmare.
DB schema errors are hard for users to understand in the first place, but an error that now pops up at an unexpected location, where you can't figure out its origin, is an absolute no-go.

If you are really so concerned about CPU performance, make this a flag in the API class.
api.disable_db_schema_check and api.enable_db_schema_check.
And default is checks on.
If a user is done debugging their script and is concerned about performance, they can disable the checks.

I also think that this is the wrong place to disable the checks. In my opinion the disabling should happen in api._is_node_schema_valid; hiding it in a node function that is called upon updating seems cryptic to me.
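A minimal sketch of the flag idea above, for illustration only. The class body, the attribute `_db_schema_check_enabled`, the helper `_run_schema_validation`, and the placeholder validation rule are all assumptions and do not reflect the SDK's real implementation; only the names `enable_db_schema_check`, `disable_db_schema_check`, and `_is_node_schema_valid` come from the comment, and treating the first two as methods is itself an assumption.

```python
import json


class API:
    """Hypothetical stand-in for the SDK's API class (not the real one)."""

    def __init__(self) -> None:
        # Checks default to on, as proposed in the comment above.
        self._db_schema_check_enabled = True

    def enable_db_schema_check(self) -> None:
        self._db_schema_check_enabled = True

    def disable_db_schema_check(self) -> None:
        self._db_schema_check_enabled = False

    def _is_node_schema_valid(self, node_json: str) -> bool:
        # The flag is consulted here, in one place, rather than in a node method.
        if not self._db_schema_check_enabled:
            return True  # check skipped: nothing was actually verified
        return self._run_schema_validation(node_json)

    def _run_schema_validation(self, node_json: str) -> bool:
        # Placeholder rule standing in for the real DB schema (assumption).
        if "node" not in json.loads(node_json):
            raise ValueError("node JSON is missing the required 'node' field")
        return True
```

With this shape, a user would call `api.disable_db_schema_check()` once their script is debugged, and every caller of `_is_node_schema_valid` would pick up the change without edits to the node classes.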

InnocentBug · 2023-09-21T13:21:22Z

Also this check should 100% be on for our unit tests.
Otherwise we don't even have a chance to catch issues with the DB schema changes.

nh916 · 2023-09-21T19:34:31Z

> I also think that this is the wrong place to disable the checks. In my opinion the disabling should happen in api._is_node_schema_valid; hiding it in a node function that is called upon updating seems cryptic to me.

I do agree that the function looks a bit cryptic because we are saying "update if valid" and then we are just updating it regardless. I was thinking of first putting this idea out and seeing which parts we like and dislike, and after we come to an agreement we can figure out the rest of the refactoring/renaming to make it more readable. I figured we'd probably disagree about this and need to find a middle ground.

I can do a flag, but then we won't have any db schema validation instead of having it in fewer places, which is not ideal.

> Human time spent working with the SDK is always more valuable than computer time.

I agree, but that is exactly the issue: while the user sits waiting for their program to run, it takes forever to finish and give back the error. If only uploads were slow, it would make sense, but human time is wasted here waiting for the computer to run and, after a long time, return an error or success.

I bet the errors will be just as helpful for debugging if we have the check in one place instead of many places.

We could also try to do an experiment or user testing for this as well to be more sure.

> Debugging any script with the SDK without this check is an absolute nightmare.

I think it would actually be okay because we have some pretty good validation with beartype, our setters and getters are written nicely, and the DB schema gives pretty good errors that are easy to understand and debug from.

Tell me more about your thoughts on what would be hard. Was there a time that you were struggling with the error and had a hard time debugging this SDK?

nh916 · 2023-09-21T19:36:57Z

> Also this check should 100% be on for our unit tests. Otherwise we don't even have a chance to catch issues with the DB schema changes.

I think the API also runs the DB schema checks and gives back the DB schema errors, so we have another safety net there.

We could also try to run the DB schema check at the end of every unit test. I'm not sure how to do that yet, but if we like that idea, we can research it more and go from there.

InnocentBug · 2023-09-21T22:13:00Z

> I can do a flag, but then we won't have any db schema validation instead of having it in fewer places, which is not ideal.

A flag is good.
We can write the validate function in a way that allows overriding the flag.
And we would use that on SAVE.

It will be validated by API anyways, so forceful validation here should be fine.
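One way the override described above could look, as a hedged sketch. The parameter name `force_validation`, the attribute `db_schema_check_enabled`, the `save` method body, and the placeholder validation rule are hypothetical and chosen only for illustration; they are not taken from the SDK.

```python
import json


class API:
    """Hypothetical stand-in for the SDK's API class (not the real one)."""

    def __init__(self) -> None:
        self.db_schema_check_enabled = True  # default: checks on

    def _is_node_schema_valid(self, node_json: str, force_validation: bool = False) -> bool:
        # force_validation (assumed name) overrides a disabled flag.
        if not self.db_schema_check_enabled and not force_validation:
            return True  # check skipped: nothing was actually verified
        # Placeholder rule standing in for the real DB schema (assumption).
        if "node" not in json.loads(node_json):
            raise ValueError("node JSON is missing the required 'node' field")
        return True

    def save(self, node_json: str) -> None:
        # Saving always validates, regardless of the flag.
        self._is_node_schema_valid(node_json, force_validation=True)
        # The actual upload to the API is omitted from this sketch.
```

In this sketch, disabling the flag only affects intermediate updates; `save` still validates locally, so the user sees the SDK's error before the API would reject the node.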

InnocentBug · 2023-09-21T22:13:42Z

> I think the API also runs the DB schema checks and gives back the DB schema errors, so we have another safety net there.

Yes, but the feedback from the API on a DB schema error is really cryptic, and I don't think decoding this message for the users is effort well spent.

InnocentBug · 2023-09-21T22:14:22Z

> We could also try to run the DB schema check at the end of every unit test. I'm not sure how to do that yet, but if we like that idea, we can research it more and go from there.

In the flag solution, I would just not set the flag for our unit tests.
(Except of course to test the flag itself.)

nh916 · 2023-09-21T23:10:45Z

> > I can do a flag, but then we won't have any db schema validation instead of having it in fewer places, which is not ideal.

> A flag is good. We can write the validate function in a way that allows overriding the flag. And we would use that on SAVE.

> It will be validated by API anyways, so forceful validation here should be fine.

I can write the flag in. I'm thinking of doing it the way I wrote the verbose property, so it can be changed at any point throughout the script. I'm not sure yet how to have the DB schema check run only on save and be absent everywhere else. Would you want to do that part after I put in the flag?

nh916 · 2023-09-21T23:11:40Z

> > We could also try to run the DB schema check at the end of every unit test. I'm not sure how to do that yet, but if we like that idea, we can research it more and go from there.

> In the flag solution, I would just not set the flag for our unit tests. (Except of course to test the flag itself.)

not sure if I'd want to write a test for this because it is too small and simple and a test for it might just be overkill

changed update_if_valid method and the test performance improved by…

51c1b4a

… A LOT * changed the docstrings and code to reflect the new updates

nh916 requested a review from InnocentBug September 20, 2023 22:39

InnocentBug requested changes Sep 21, 2023


InnocentBug mentioned this pull request Sep 29, 2023

Performance improvement for node validation #374

Closed

2 tasks

InnocentBug closed this Feb 20, 2024

InnocentBug deleted the performance-improvement branch March 27, 2024 13:23
