Experiment Validator [Resolves #26] #239
**thcrock** commented on Oct 26, 2017
- Add ExperimentValidator class to hold all pre-run experiment validations, and populate with initial list of validations
- Add database_reflection and validation_primitives modules (moved from architect) to support validations
- Move experiment test utils to utils file so they are now usable in other tests
- Update README with validation information
Codecov Report

```diff
@@            Coverage Diff             @@
##           master     #239      +/-   ##
==========================================
- Coverage   80.67%   78.12%    -2.55%
==========================================
  Files           5        8       +3
  Lines         326      608     +282
==========================================
+ Hits          263      475     +212
- Misses         63      133      +70
```

Continue to review full report at Codecov.
README.md (Outdated)

```
[SQL: 'explain select * from cat_complaints']
```

If nothing is output, it means that we can't find anything wrong with the Experiment so far, and it should be ready to run.
Just from a user-friendliness perspective, might actually be nice to output a message saying that no validation errors were found.
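A minimal sketch of that suggestion (the helper name and return-a-string design are hypothetical, not the PR's actual API): report an explicit success message when the list of validation errors is empty, rather than printing nothing.

```python
def report_validation(errors):
    # Hypothetical helper: summarize validation results for the user,
    # with an explicit success message instead of silent output.
    if not errors:
        return ("No validation errors were found. "
                "The experiment should be ready to run.")
    return "\n".join("Validation error: {}".format(e) for e in errors)
```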
triage/experiments/validate.py (Outdated)

```python
        self.db_engine = db_engine


class FeatureAggregationsValidator(Validator):
```
Is most of this just moving from architect.FeatureGenerator? If so, we should probably include the imputation validation here, too (and probably a reasonable time to go ahead and address this issue while we're at it). Also, is there an architect PR to remove from there as well?
Yep, lots of things in here are just moving from Architect. It's easy enough to move the imputation stuff from Architect.
The collate issue seems like it will be less needed post-merge.
Removing things from architect I figured we would do after this PR. Though now, with the code merge, I'm not sure if it's worth it. Maybe architect could keep its validations as it rides off into the sunset.
Makes sense to hold off on removing things from architect. Though I think the collate issue will still be important even post-monolith, in terms of making things a little clearer and ensuring you only have to edit one place in the code (even if the monolith will at least mean you only have to edit one repo regardless).
triage/experiments/validate.py (Outdated)

```python
    def _validate_categoricals(self, categoricals):
        conn = self.db_engine.connect()
        for categorical in categoricals:
            if 'choice_query' in categorical:
```
If `choice_query` isn't specified, should we validate that `choices` is (and, likewise, that one and only one of those is given)?
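The "one and only one" check suggested above could look something like this (a hedged sketch; the function name is hypothetical and the real validator holds more state):

```python
def validate_choice_source(categorical):
    # Hypothetical sketch: a categorical feature definition must supply
    # exactly one of 'choices' (an explicit list) or 'choice_query'
    # (a query that produces the list).
    has_choices = 'choices' in categorical
    has_query = 'choice_query' in categorical
    if has_choices == has_query:  # both present, or both missing
        raise ValueError(
            "Exactly one of 'choices' or 'choice_query' must be given; "
            "got keys: {}".format(sorted(categorical))
        )
```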
```python
class StateConfigValidator(Validator):
    def run(self, state_config):
        if 'table_name' in state_config:
```
Maybe warn that we will be falling back to default behavior if `table_name` isn't specified?
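That warning could be as small as this (a sketch; the function name and warning text are assumptions, not the PR's code):

```python
import warnings

def validate_state_config(state_config):
    # Hypothetical sketch of the suggested warning: if no custom state
    # table is configured, tell the user the default behavior applies
    # rather than failing or staying silent.
    if 'table_name' not in state_config:
        warnings.warn(
            "No 'table_name' given in state_config; "
            "falling back to default state behavior."
        )
```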
triage/experiments/validate.py (Outdated)

```python
            raise ValueError('''Unknown feature_group_definition key {} received.
            Available keys are {}'''.format(subsetter_name, available_subsetters))
        if not hasattr(value, '__iter__') or isinstance(value, (str, bytes)):
            raise ValueError('Each value in FeatureGroupCreator must be iterable and not a string')
```
not sure someone just using a triage config is going to know what to do with this error message.
triage/experiments/validate.py (Outdated)

```python
        if not hasattr(value, '__iter__') or isinstance(value, (str, bytes)):
            raise ValueError('Each value in FeatureGroupCreator must be iterable and not a string')

        if 'prefix' in feature_group_definition:
```
looks like we validate specified prefixes but not if you specify tables?
triage/experiments/validate.py (Outdated)

```python
class ModelGroupKeysValidator(Validator):
    def run(self, model_group_keys, user_metadata):
```
I think there might be more we can do here around potential logical fallacies or places where the user might get unexpected results, at least with warnings. For instance, I can't think of a good reason not to have `train_duration` as part of the model group keys, while at the same time, I can't think of a case where having `matrix_end_time` in the keys would make sense.
This should be good, but it seems like it might be better to actually bar them from doing certain things, or to auto-add certain things. If there isn't a good reason not to have certain keys, should we just add them to the defaults?
yeah, adding to defaults and disallowing certain things might make more sense than simply checking for those things on validation.
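The "add to defaults / disallow" idea could be sketched like this. Note the key names used here (`train_duration`, `matrix_end_time`) are just the examples from this discussion, not triage's actual default or disallowed lists:

```python
# Hypothetical sketch only: these sets are illustrative, taken from the
# examples in this thread, not from triage's real configuration.
REQUIRED_KEYS = {'train_duration'}
DISALLOWED_KEYS = {'matrix_end_time'}

def normalize_model_group_keys(user_keys):
    keys = set(user_keys) | REQUIRED_KEYS  # auto-add sensible defaults
    bad = keys & DISALLOWED_KEYS           # bar keys that never make sense
    if bad:
        raise ValueError(
            'model_group_keys may not include: {}'.format(sorted(bad)))
    return sorted(keys)
```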
triage/experiments/validate.py (Outdated)

```python
                Available keys are {}'''.format(model_group_key, available_keys))


class GridConfigValidator(Validator):
```
Maybe at the very least make sure that all of the model classes specified actually exist, and run `get_params()` on them to ensure that the specified parameters exist?
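A sketch of that check, assuming grid keys are dotted class paths (as in triage configs) and that each class follows the scikit-learn convention of exposing `get_params()`:

```python
import importlib

def validate_grid_config(grid_config):
    # Hypothetical sketch: import each configured model class, then
    # compare the parameter grid's keys against what get_params()
    # reports on a default-constructed instance.
    for class_path, param_grid in grid_config.items():
        module_name, class_name = class_path.rsplit('.', 1)
        try:
            cls = getattr(importlib.import_module(module_name), class_name)
        except (ImportError, AttributeError):
            raise ValueError('Unknown model class: {}'.format(class_path))
        unknown = set(param_grid) - set(cls().get_params())
        if unknown:
            raise ValueError('Unknown parameters {} for {}'.format(
                sorted(unknown), class_path))
```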
triage/experiments/validate.py (Outdated)

```python
        pass


class ScoringConfigValidator(Validator):
```
At least ensure that the specified metrics exist and are properly defined with a metric decorator to specify whether greater is better?
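Something like the following could cover that (a sketch: `available_metrics` is an assumed name-to-function mapping, and the `greater_is_better` attribute stands in for whatever the metric decorator actually attaches):

```python
def validate_metrics(metric_names, available_metrics):
    # Hypothetical sketch: check each configured metric exists and was
    # defined with a decorator that attaches 'greater_is_better', so
    # the ranking direction is unambiguous.
    for name in metric_names:
        if name not in available_metrics:
            raise ValueError('Unknown metric: {}'.format(name))
        if not hasattr(available_metrics[name], 'greater_is_better'):
            raise ValueError(
                "Metric '{}' does not declare greater_is_better".format(name))
```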
triage/experiments/validate.py (Outdated)

```python
class ExperimentValidator(Validator):
    def run(self, experiment_config):
```
feels like we should probably check the config version here, too, right? though at least that will get caught pretty immediately when trying to run the experiment.
That actually happens at instantiation time, so you could never even get to attempting to run it, or even to this validation code, under the recommended code path (`experiment.validate()`). We could just remove it from there and put it here instead.
Went through here and added some inline comments. Generally looks good and I think this will definitely be a helpful addition. One thing that seems like it's missing, though, is a validation of the temporal config, which often feels like a big source of logical errors, potential leakage problems, etc. And, given that timechop is generally fast to run, it might make sense to actually generate the time cuts and do some simple validation of the output (matrices with no as-of dates? obvious leakage problems?)
triage/database_reflection.py (Outdated)

```python
        db_engine.execute('select * from {} limit 1'.format(table_name))
    ]

    return len(results) > 0
```
Not a hugely important difference, but I believe you can safely load even less of the actual data here, if you just want to check that there are any rows in the table:

```python
result = db_engine.execute(
    'select 1 from {table_name} limit 1'
    .format(table_name=table_name)
)
return any(result)
```
triage/experiments/validate.py (Outdated)

```python
        logging.info('Validating time intervals')
        for interval in intervals:
            if interval != 'all':
                _ = convert_str_to_relativedelta(interval)
```
just curious, why the assignment? for pdb?
triage/experiments/validate.py (Outdated)

```python
                _ = convert_str_to_relativedelta(interval)

    def _validate_groups(self, groups):
        if not any(group == 'entity_id' for group in groups):
```
Is this different from `'entity_id' in groups`?
triage/experiments/validate.py (Outdated)

```python
                raise ValueError('''Aggregation prefix of '{}' was given as a
                feature group, but no such prefix exists in the available
                feature aggregations. The available feature aggregations are {}
                '''.format(prefix, available_prefixes))
```
Another trivial suggestion, but if you're doing a bunch of `[not] in` checks, a `set` would usually be more performant. And, I wonder if it might be nice to tell them up front all of the bad prefixes, along the lines of:

```python
available_prefixes = {aggregation['prefix'] for aggregation in feature_aggregation_config}
bad_prefixes = set(feature_group_definition['prefix']) - available_prefixes
if bad_prefixes:
    raise ValueError(….format(bad_prefixes, available_prefixes))
```
triage/experiments/validate.py (Outdated)

```python
        for strategy in feature_group_strategies:
            if strategy not in available_strategies:
                raise ValueError('''Unknown feature_group_strategies key {} received.
                Available keys are {}'''.format(strategy, available_strategies))
```
(ditto)
Force-pushed from 011f047 to 44eb390
I added temporal config output validation, but it's very rudimentary. I'm not really sure what makes sense to put in there (what are obvious leakage problems)?
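One concrete "obvious leakage problem" that could be checked: a training label window that extends past the start of the test period, so training labels would be computed from outcomes the model shouldn't have seen. A hedged sketch with plain datetimes (the function and argument names are hypothetical, not timechop's actual output format):

```python
from datetime import datetime, timedelta

def check_label_leakage(train_as_of_times, label_timespan, first_test_time):
    # Hypothetical sketch: the label window of the latest training
    # as-of date must end before the test period begins.
    latest_label_end = max(train_as_of_times) + label_timespan
    if latest_label_end > first_test_time:
        raise ValueError(
            'Training labels extend to {}, past test start {}'.format(
                latest_label_end, first_test_time))
```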
👍
```python
result = [
    row for row in
    db_engine.execute('select 1 from {} limit 1'.format(table_name))
]
```
This still seems like a superfluous list comprehension – it neither maps nor filters; and, I gather that the SQLAlchemy result object is iterable, and sufficient (and even if we had to cast to list, we could do so explicitly, i.e. `list(results)`). …But, quite far from a big deal!
triage/experiments/validate.py (Outdated)

```python
        if 'tables' in feature_group_definition:
            available_tables = {
                aggregation['prefix'] + '_aggregation'
```
Oh, actually, with the imputation branch merged this should be `aggregation['prefix'] + '_aggregation_imputed'`. That said, it looks like we also need to fix the example of this in the example experiment config, if you don't mind doing that along with this PR.
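The corrected set comprehension would then look something like this (sketch of the fix discussed above; the helper name is just for illustration):

```python
def available_feature_tables(feature_aggregation_config):
    # Post-imputation-merge, the final feature table for each
    # aggregation carries the '_aggregation_imputed' suffix rather
    # than '_aggregation'.
    return {
        aggregation['prefix'] + '_aggregation_imputed'
        for aggregation in feature_aggregation_config
    }
```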
Thanks, @thcrock! Three thoughts here:
… indicate success (force-pushed from baec991 to 2d0bef1)
Thanks, @thcrock -- I think these generally look good to me and I'd be good with merging (of course, we can continue to add other validations as we go). Seems like pointing people to the