Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

item_is_latest always returns False #14

Open
ilia-nikiforov-umn opened this issue Apr 15, 2024 · 6 comments
Open

item_is_latest always returns False #14

ilia-nikiforov-umn opened this issue Apr 15, 2024 · 6 comments
Assignees

Comments

@ilia-nikiforov-umn
Copy link
Contributor

ilia-nikiforov-umn commented Apr 15, 2024

@dskarls it seems to me that this code never worked, but it was masked by the fact that all previous tests always set "limit":1 when querying for dependencies.

return ver == latest_ver just doesn't make any sense. latest_ver is the result of a query for the boolean field latest, while ver is the actual numerical version parsed from the kimcode. The function always returns False. Does that seem right?

https://github.com/openkim/developer-platform/blame/c94bae33af4f5bf435665e60107402f7e8646c86/docker/config/excerpts/util.py#L199

@ilia-nikiforov-umn ilia-nikiforov-umn self-assigned this Apr 15, 2024
@dskarls
Copy link
Contributor

dskarls commented Apr 15, 2024

Nice job finding this. Definitely a bug. Not only does ver == latest_ver not make sense, but the future proofing I attempted to put in is mostly just dead code anyway because we're using limit=1 in the query just above it. This means that the only way a list will be returned is if it's an empty list (no result was found).

As far as I can tell, things did indeed work thus far because the pipeline.stdin queries of the tests all used limit: 1 like you say. A typical one is something like the one from the grain boundary tests:

@< query({
  "flat": "on",
  "database": "data",
  "fields": {"cohesive-potential-energy.si-value": 1, "a.si-value":1},
  "project":["a.si-value", "cohesive-potential-energy.si-value"],
  "limit": 1,
  "query":{
    "property-id":{"$regex":":property/cohesive-potential-energy-cubic-crystal$"},
    "meta.subject.kimcode":MODELNAME,
    "meta.runner.kimcode": {"$regex":"LatticeConstantCubicEnergy_bcc_Fe__TE_[0-9]{12}_[0-9]{3}"}
  }
}) >@

Since util.item_is_latest always returned False, it led the test result query intercept code in template.py to always add history: 1 onto the queries. This didn't create a problem w.r.t. the different models in the query results since the model version is already fixed to a specific version in the query via MODELNAME. It didn't cause a problem w.r.t. the runner version because that code always prepends the sort with a descending sort on meta.runner.version. If the original query had version-pinned the upstream dependency to a specific version, then there should only be results for that exact test version coupled with the exact model version. If, as in the typical case, the query just used a regex to allow for all results computed using all versions of the upstream test to be computed, then the intercept code is still always prepending a sort on meta.runner.version, so we would always get the expected result after throwing out all but the first one.

What's the specific case you're looking into now where the pipeline.stdin query is not using limit: 1?

@ilia-nikiforov-umn
Copy link
Contributor Author

ilia-nikiforov-umn commented Apr 15, 2024 via email

@dskarls
Copy link
Contributor

dskarls commented Apr 15, 2024

Ok, thanks for the explanation -- it's a good thing we have access to HPC resources these days =P. Are you already fixing this? It should be as simple as just querying for the latest version of a short id and then checking its version against the current subject's version (ver). The only complicated bit might be if the subject lineage in question has been "discontinued," in which case none of the versions of the subject in the database will have latest=True (nor will any of their results). There might be a design decision to be made in that case.

@ilia-nikiforov-umn
Copy link
Contributor Author

ilia-nikiforov-umn commented Apr 17, 2024 via email

@dskarls
Copy link
Contributor

dskarls commented Apr 17, 2024

The multiple equilibria don't really tax our resources, since only a very
small minority of crystal prototypes have multiple equilibria. It's a
really annoying thorn in our side that we have to take care of for a small
minority of crystals -- this isn't the first complication it's caused :(

I was thinking more in terms of downstream dependent tests that might repeat their computations for each equilibrium state they get from their pipeline.stdin.tpl queries. If you increase the number of results generated by a test by a factor A, then the number of calculations in each level of downstream dependents would increase by the same factor. Linearity usually doesn't sound bad but some of our tests (GrainBoundary, DislocationCore) can be pretty expensive.

Anyway, I'm working on it. We could do what you suggest, but wouldn't it be
easier to just drop any comparisons period? Just query for the provided
short-id and see if it's 'latest'. I have to make an analogous change to
helper_functions.item_is_latest as well, and make sure it works, but more
importantly I have to change the upstream logic for if someone wants to run
a test on a stale model -- that query should return only the latest test
results for that stale model, but obviously even with the fix I described
above, item_is_latest rightfully returns False, and therefore it trips
the query to return the entire history including stale test results.

Yeah, just do a query for the model extended ID and check 'latest' (if the query returns an empty list, return True). Looks like helper_functions.item_is_latest is more complicated since currently no 'obj' collection is maintained in the local database, but are you sure that's also broken?

As far as running with a stale or discontinued model, the intercept_query function is appending a descending sort on meta.runner.version and created_on (also I think maybe the -1 on L141 might need to be a 1 instead?), but I guess it's still going to dump all of those results into the rendered pipeline.stdin. On the surface, it seems this could be fixed with postprocessing to look at the highest runner version contained in the results, but you also need to think about created_on in case there are multiple results for a given runner version (definitely possible when using the local database and, in principle, the production database although unlikely). I guess it's also possible that a test might output one property instance from a run but not another; I would ignore this possibility.

In the pipeline, only the latest versions of things are generally run. The pipeline-run-pair endpoint was the one exception, where you could request specific runner and subject versions are run together, but I'm not sure if that was properly handling this scenario either. It looks like the same logic is present in the pipeline.stdin.tpl rendering code.

I assume this issue also applies to the get_test_result function provided in the template language.

@ilia-nikiforov-umn
Copy link
Contributor Author

Partially fixed with cc574d5

Works correctly for fresh models, I don't think stale models are a high priority right now, so I added a warning when a stale query is performed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants