-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
item_is_latest always returns False #14
Comments
Nice job finding this. Definitely a bug. Not only does As far as I can tell, things did indeed work thus far because the pipeline.stdin queries of the tests all used
Since What's the specific case you're looking into now where the pipeline.stdin query is not using |
Thanks for confirming!
The new Crystal Genome framework allows for multiple equilibria for the
same symmetry:
https://openkim.org/id/TE_391353814247_001-and-MO_179420363944_002-tr
As you can see, the results.edn contains 4 instances, two of each property
(one energy and one structure). So every downstream dependency (e.g.
elastic constants) needs to loop over each distinct equilibrium and report
a separate property instance.
So, for example, I was using this query:
"query": {
"meta.type":"tr",
***@***.***
,2023-02-21:property/crystal-structure-npt",
"meta.subject.extended-id":MODELNAME,
"stoichiometric-species.source-value":{
"$size":{{stoichiometric_species|length}},
"$all":{{stoichiometric_species}}
},
"prototype-label.source-value": "{{prototype_label}}",
"cell-cauchy-stress.si-value": 0,
"temperature.si-value": 0
},
"fields":{
"a.si-value":1,
"parameter-names.source-value":1,
"parameter-values.source-value":1,
"library-prototype-label.source-value":1,
"short-name.source-value":1,
},
"database":"data", "limit":0, "flat":"on"
})
And because we are on version 002 of some of these tests, it is returning
duplicate results due to history=on, one duplicate for each test.
…On Mon, Apr 15, 2024 at 12:10 PM Daniel S. Karls ***@***.***> wrote:
Nice job finding this. Definitely a bug. Not only does ver == latest_ver
not make sense, but the future proofing I attempted to put in is mostly
just dead code anyway because we're using limit=1 in the query just above
it. This means that the only way a list will be returned is if it's an
empty list (no result was found).
As far as I can tell, things did indeed work thus far because the
pipeline.stdin queries of the tests all used limit: 1 like you say. A
typical one is something like the one from the grain boundary tests:
@< query({
"flat": "on",
"database": "data",
"fields": {"cohesive-potential-energy.si-value": 1, "a.si-value":1},
"project":["a.si-value", "cohesive-potential-energy.si-value"],
"limit": 1,
"query":{
"property-id":{"$regex":":property/cohesive-potential-energy-cubic-crystal$"},
"meta.subject.kimcode":MODELNAME,
"meta.runner.kimcode": {"$regex":"LatticeConstantCubicEnergy_bcc_Fe__TE_[0-9]{12}_[0-9]{3}"}
}
}) >@
Since util.item_is_latest always returned False, it led the test result
query intercept code in template.py
<https://github.com/openkim/developer-platform/blob/c94bae33af4f5bf435665e60107402f7e8646c86/docker/config/excerpts/template.py#L128-L148>
to always add history: 1 onto the queries. This didn't create a problem
w.r.t. the different models in the query results since the model version is
already fixed to a specific version in the query via MODELNAME. It didn't
cause a problem w.r.t. the runner version because that code always prepends
the sort with a descending sort on meta.runner.version. If the original
query had version-pinned the upstream dependency to a specific version,
then there should only be results for that exact test version coupled with
the exact model version. If, as in the typical case, the query just used a
regex to allow for all results computed using all versions of the upstream
test to be computed, then the intercept code is still always prepending a
sort on meta.runner.version, so we would always get the expected result
after throwing out all but the first one.
What's the specific case you're looking into now where the pipeline.stdin
query is not using limit: 1?
—
Reply to this email directly, view it on GitHub
<#14 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AS2TGGGEZW2SIM6UGWMNENDY5QC2FAVCNFSM6AAAAABGGSF6JWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDANJXGQZDCMZSGI>
.
You are receiving this because you authored the thread.Message ID:
***@***.***>
|
Ok, thanks for the explanation -- it's a good thing we have access to HPC resources these days =P. Are you already fixing this? It should be as simple as just querying for the latest version of a short id and then checking its version against the current subject's version ( |
The multiple equilibria don't really tax our resources, since only a very
small minority of crystal prototypes have multiple equilibria. It's a
really annoying thorn in our side that we have to take care of for a small
minority of crystals -- this isn't the first complication it's caused :(
Anyway, I'm working on it. We could do what you suggest, but wouldn't it be
easier to just drop any comparisons period? Just query for the provided
short-id and see if it's 'latest'. I have to make an analogous change to
helper_functions.item_is_latest as well, and make sure it works, but more
importantly I have to change the upstream logic for if someone wants to run
a test on a stale model -- that query should return only the latest test
results for that stale model, but obviously even with the fix I described
above, `item_is_latest` rightfully returns False, and therefore it trips
the query to return the entire history including stale test results.
@brendonwaters @dskarls Is there analogous logic in the pipeline? I
couldn't find anything similar-looking through a text search, so I'm hoping
that it won't be an issue.
…On Mon, Apr 15, 2024 at 3:44 PM Daniel S. Karls ***@***.***> wrote:
Ok, thanks for the explanation -- it's a good thing we have access to HPC
resources these days =P. Are you already fixing this? It should be as
simple as just querying for the latest version of a short id and then
checking its version against the current subject's version (ver). The
only complicated bit might be if the subject lineage in question has been
"discontinued," in which case none of the versions of the subject in the
database will have latest=True (nor will any of their results). There
might be a design decision to be made in that case.
—
Reply to this email directly, view it on GitHub
<#14 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AS2TGGB5FGVRLARBPQZ2VK3Y5Q33VAVCNFSM6AAAAABGGSF6JWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDANJXG43TIOBTGI>
.
You are receiving this because you authored the thread.Message ID:
***@***.***>
|
I was thinking more in terms of downstream dependent tests that might repeat their computations for each equilibrium state they get from their pipeline.stdin.tpl queries. If you increase the number of results generated by a test by a factor A, then the number of calculations in each level of downstream dependents would increase by the same factor. Linearity usually doesn't sound bad but some of our tests (GrainBoundary, DislocationCore) can be pretty expensive.
Yeah, just do a query for the model extended ID and check 'latest' (if the query returns an empty list, return As far as running with a stale or discontinued model, the intercept_query function is appending a descending sort on In the pipeline, only the latest versions of things are generally run. The I assume this issue also applies to the |
Partially fixed with cc574d5 Works correctly for fresh models, I don't think stale models are a high priority right now, so I added a warning when a stale query is performed. |
@dskarls it seems to me that this code never worked, but it was masked by the fact that all previous tests always set
"limit":1
when querying for dependencies.return ver == latest_ver
just doesn't make any sense.latest_ver
is the result of a query for the boolean fieldlatest
, whilever
is the actual numerical version parsed from the kimcode. The function always returnsFalse
. Does that seem right?https://github.com/openkim/developer-platform/blame/c94bae33af4f5bf435665e60107402f7e8646c86/docker/config/excerpts/util.py#L199
The text was updated successfully, but these errors were encountered: