-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/relax and bands wc #254
base: master
Are you sure you want to change the base?
Conversation
the relaxation and subsequently the calculation of bands. It expose inputs of the `CommonRelaxInputGenerator` and makes them "non_db" so that the ports of `CommonRelaxInputGenerator` can be used direcly with no modifications.
Once an automatic generation of the kpoints path for bands is performed by SeeKpath, this might refere to a conventional structure that is different from the input one. For this reason it is necessary to run a further scf before calculating bands. The inputs for the further scf are now exposed. They are the inputs of a `CommonRelaxInpuGenerator`, but where the default relaxation_type is NONE and all the other defaults have been removed. If not defined by user, the same inputs of the first relaxation are used (except the new default for relaxation_type)
@sphuber I would like especially a feedback on how to handle defaults of the |
I will also now write few tests |
Why do you want to get rid of the defaults?
Nope, that is not what it is supposed to do. Setting |
This is exactly what I want. But unfortunately what I see is different. I set all the inputs to have |
But spec.expose_inputs(CommonRelaxInputGenerator, namespace='relax', exclude=('structure',),
namespace_options={'required': False, 'populate_defaults': False,
'help': 'Inputs for the `CommonRelaxInputGenerator`, if not specified at all, the relaxation step is skipped.'}) |
Ok, thanks @sphuber, I understand, but then this does not cover my use case, since I want one input in the namespace to be populated, and the others no. How do you do that? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @bosonie , have some comments on design and implementation
namspac = spec.inputs.create_port_namespace('relax_inputs') | ||
namspac.absorb(CommonRelaxInputGenerator.spec().inputs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should use expose_inputs
. If the namespace is fully optional, you set populate_defaults = False
:
spec.expose_inputs(
CommonRelaxInputGenerator,
namespace='relax',
exclude=('structure',),
namespace_options={'populate_defaults': False, 'required': False}
)
spec.expose_inputs(
CommonRelaxInputGenerator,
namespace='scf',
exclude=('structure',),
namespace_options={'populate_defaults': False, 'required': False}
)
spec.expose_inputs(
CommonBandsInputGenerator,
namespace='bands'
)
|
||
namspac = spec.inputs.create_port_namespace('relax_inputs') | ||
namspac.absorb(CommonRelaxInputGenerator.spec().inputs) | ||
namspac['protocol'].non_db = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of doing this manually, I would write a recursive function and call it:
def set_port_non_db(port):
"""Set ``non_db = True`` for the given port.
.. note:: If ``port``, is a port namespace, the function will be called recursively on all sub ports.
:param port: a port or port namespace.
"""
if isinstance(port, PortNamespace):
for subport in port.values():
set_port_non_db(subport)
elif isinstance(port, InputPort):
port.non_db = True
Then in the define
method, you simply call the method:
set_port_non_db(spec.inputs)
namspac3[key].populate_defaults = False | ||
namspac3[key].default = () | ||
namspac3['relax_type'].required = True | ||
namspac3['relax_type'].default = RelaxType.NONE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't rely on this through a default because then a user can run anything else. We should always set this no? So just set it in the workchain step when preparing the inputs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought this was against the idea to fully expose the inputs of the sub-processes, but ok, this makes sense. And also avoids the problem of my last comments above.
spec.outline( | ||
cls.initialize, | ||
cls.run_relax, | ||
cls.prepare_bands, | ||
if_(cls.should_run_other_scf)( | ||
cls.fix_inputs, | ||
cls.run_relax | ||
), | ||
cls.run_bands, | ||
cls.inspect_bands | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was wondering if we should have a slightly different outline:
spec.outline( | |
cls.initialize, | |
cls.run_relax, | |
cls.prepare_bands, | |
if_(cls.should_run_other_scf)( | |
cls.fix_inputs, | |
cls.run_relax | |
), | |
cls.run_bands, | |
cls.inspect_bands | |
) | |
spec.outline( | |
cls.setup, | |
if_(cls.should_run_relax)( | |
cls.run_relax, | |
cls.inspect_relax, | |
), | |
if_(cls.should_run_scf)( | |
cls.run_seekpath, | |
cls.run_scf, | |
cls.inspect_scf, | |
), | |
cls.run_bands, | |
cls.inspect_bands, | |
cls.results, | |
) |
I think this is a bit more explicit and clear as to what will happen.
This would make the relax optional, which is still useful, because the user can then compute the bands for an already optimized structure but be sure that the bands are computed in exactly the same way as if it were to include relaxation.
I would also call the second relax run an SCF, because that is what it is, even though we are using the RelaxWorkflow
but that is merely because we decided that this is the way an SCF is run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I am sensing a conceptual difference here. In my mind the user would NOT have the possibility to decide whether to run an extra scf or not. To determine the presence of an extra step is only the fact that seekpath was used (and in the future maybe the decision to us a different code for bands compared to relax). The user could only select some inputs for this step.
Your idea only gives the additional flexibility to run an extra scf even when it is not necessary (when seekpath is not used) and I believe it is a valid use case.
However I have strong doubts about your other suggestion to make the relaxation optional
This would make the relax optional, which is still useful, because the user can then compute the bands for an already optimized structure but be sure that the bands are computed in exactly the same way as if it were to include relaxation.
This can be done already setting the relax_type
to None in the relax
input. So it is not needed. I do not see any other case when this is useful and the price to pay is very high. In fact I believe that making both the relax
step and the scf
step optional is difficult to support. How do we understand that the user wants only the scf
and no the relaxation? If we make populate_defaults = False
and required = False
for both relaxation and scf, then we have to implement additional logic that checks that one of the two is set, and that the one selected sets at least the structure
, protocol
, engines
. Basically it makes useless to expose the inputs since None of the properties of the inputs port can be used, they must be implemented all over again. It seems to me an additional complication that has no motivation to exist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that making both optional adds more complexity. But I don't understand your current design. If I understand correctly, in prepare_bands
you check whether explicit kpoints were specified in the bands
option, in which case you don't rerun the relax workflow as scf. But the first relax workflow and seekpath are always run. So it is practically guaranteed that the structure changes, even if the user sets RelaxType.NONE
. Even if the relax doesn't change the structure, it is possible that seekpath still normalizes the structure, which will cause the k-points to longer be commensurate.
I guess what I am saying is that we should make the relax+seekpath optional, or otherwise allowing to directly specify k-points doesn't make any sense. Or we just require relax+seekpath+scf always and just don't support the user specifying their own k-points. At least that would be consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not exactly. The relaxation always run. The use of SeeKpath is optional. If you set a kpoints for bands in input, SeekPath is not used. This is a valid use case, for instance, I know the kpoints path along which I want to calculate the bands, and I also want a relaxation of just the atoms coordinates. Of course it does not make much sense to specify a kpoints path and also allow relaxation with variable cell, since you will not know the final cell shape. But the rest is perfectly fine.
When you DO NOT use Seekpath (i.e. you specify kpoints for bands in input), you do not need an extra scf. In fact, you pass to the bands calculation the remote_folder
of the relaxation and this guarantees to copy the correct final structure and the density matrix / wave-functions. However, based to your suggestion, I understood that the possibility for the user to trigger an extra scf anyway, is a valid use case. For instance a user wants to do a final scf with "precise" protocol before calculating bands.
To summarize, three options are available
- relax+seekpath+scf+bands
- relax+kp_in_input+bands
- relax+kp_in_input+scf+bands
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course it does not make much sense to specify a kpoints path and also allow relaxation with variable cell, since you will not know the final cell shape. But the rest is perfectly fine.
This is what was confusing me. Should we maybe have validation in there to prevent this? If you specify kpoints, then the relax cannot change the cell.
To summarize, three options are available
relax+seekpath+scf+bands
relax+kp_in_input+bands
relax+kp_in_input+scf+bands
This is great, this makes things a lot clearer. I would propose to at least describe this in the docstring of the workchain. And then maybe adapt the outline to the following that makes it a lot clearer what is happening:
spec.outline(
cls.setup,
cls.run_relax,
cls.inspect_relax,
if_(cls.should_run_seekpath)(
cls.run_seekpath,
cls.run_scf,
cls.inspect_scf,
).elif_(cls.should_run_scf)(
cls.run_scf,
cls.inspect_scf,
),
cls.run_bands,
cls.inspect_bands,
cls.results,
)
I addressed almost all your comments @sphuber. Two are open for discussion. Thanks for your review. However I found another problematic behavior of the ports. It seems to me that the |
The issue I found is here aiidateam/aiida-core#5286 |
This PR introduce a code agnostic workchain that performs relaxation and subsequently the calculation of bands.
It also has the possibility to use seekpath to generates the high symmetries kpoints path for bands. This process might change the structure, therefore, when seekpath is called, a further scf step is needed on the nex structure. This is automatically handled by the workchain. The inputs for the further scf are exposed. They are the inputs of a
CommonRelaxInpuGenerator
, but where the default relaxation_type is NONE and all the other defaults have been removed. If not defined by user, the same inputs of the first relaxation are used (except the new default for relaxation_type).Limitation of the current implementation (will be improved later):
CommonRelaxInputGenerator
andCommonBandsInputGenerator
respectively and they arenon_db
.So their value will not be stored as inputs of the workchain.