Consider allowing checkout_externals to run even if in-sync components are modified #112
This desire, or something like it, is coming up again in the context of ESCOMP/CESM#139. To satisfy the needs in that discussion, for the question of what to do when a component is in-sync but has modified its sub-externals cfg file: I believe we would want to read from the modified sub-externals cfg file (NOT abort).
A comment I made in ESCOMP/CESM#139 (comment) might be better suited for this issue ticket:

> So if …
@mnlevy1981 note that, in this issue, I'm suggesting something broader that I think also encompasses your suggestion: I'm suggesting that we should consider allowing any files to be modified in an external, as long as that external is already at the right SHA. That is, manage_externals would only abort if there is a given external that both (a) is at the wrong SHA and (b) has modified files. I'm not tied to that idea, though, and if others feel that is too permissive then I'd be happy with a fallback like the one you suggest. But I'm curious what you think of that broader, more permissive idea.
We had a discussion today with about six CSEG folks. We talked about @mnlevy1981's idea of an ignore file and we like it. It seems to be the best way to implement this without being too permissive. We thought it shouldn't be an "invisible" file, though, and it probably should only be invoked with an option where you give the file, so something like:

`./manage_externals/checkout_externals --ignore manage_externals_ignore`

The contents of the file would just be a list of files that can be modified without manage_externals failing. Having an explicit file like this means you have thought through any ramifications of allowing those files to be modified (or at least you can only blame yourself). This means the two special cases of consideration in the opening text are handled by the user and we don't have to decide on particular behavior for the tool here.
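For concreteness, here is a hypothetical sketch of what such an ignore file might contain. The file name comes from the command above; the paths (and the assumption that comment lines are allowed) are made up purely for illustration. Each line would name a file, relative to the top of the sandbox, whose local modifications checkout_externals should tolerate:

```
# manage_externals_ignore -- hypothetical example; these paths are illustrative only
cime/config/config_machines.xml
components/cism/source_cism/README
```

Whether the format would allow comments or wildcard patterns is an open design question that the discussion above does not settle.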
Sorry I wasn't able to make it to the meeting yesterday. I'm curious: were there specific concerns with always letting manage_externals run if an in-sync component has been modified? I'm having trouble seeing situations where that would cause problems, and as mentioned in my initial comment, there seem to be legitimate times when this extra permissiveness would lead to less frustration for users. So I'm just wondering if there are downsides to this that others see.
I am curious about your proposal to require an explicit …
My only concern is that allowing something that used to cause an error is a big change to the design of the software, and so I think any measure that can be taken to minimize that change should be taken. I wasn't involved in the original design meeting that resulted in the current behavior, so I think the best summary of my opinion is "I don't know why it throws an error if it's in a dirty state but at the right SHA, but I wanted to suggest a fix that maintains that behavior as much as possible." As for the behavior itself, I don't have any issue with the idea of only throwing an error if the repo is dirty AND it's at the wrong SHA. It seems like you, @gold2718, and perhaps @jedwards4b are the three people running this repo, so if you all think the previous design was a poor choice I'll support your decision to change it.
I think in the production environment, I'd want manage_externals to abort if it found that any of the externals were "dirty". In that case it would only be dirty if something happened by accident, so I'd want it to abort and CIME to fail so that I know about it. In the development environment I still might want to be warned about an external being dirty, because it was changed accidentally. Which is part of why I think it should be an option rather than the regular behavior.
I want to explain my thoughts a little more, but I also want to be clear: I don't feel strongly about this, so I'm not going to push too hard for it if others feel it's a bad idea.

@mnlevy1981 thanks for the explanation. I appreciate your concern for maintaining old behavior. If I remember correctly, Ben Andre (who did much of the initial coding for manage_externals) felt strongly that the state of your sandbox after running manage_externals should always be predictable, and the end state should never depend on the history of what you have done previously in that sandbox (unless manage_externals aborts with an error). I saw the merit in that point and agreed that it made sense, until I started seeing that it can cause significant inconvenience in practice... so I have personally started to feel that a slightly less rigid approach is justified in certain cases. (There is more discussion of some related philosophical / high-level design issues in #51. In particular, see option (5) in #51 (comment), which considered the issue under discussion here; at the time we came down against this, but that seemed to be based more on gut feelings than on any particular well-thought-through considerations.)

@ekluzek you make good points about the importance of being conservative in a production environment. However, I don't feel that manage_externals is the right place to do these checks. Instead, if we feel we should have checks for a clean sandbox, these checks should be done by cime sometime in the case creation, build or submission process (though cime can use manage_externals to help determine the sandbox's full status). If we rely on manage_externals to do this, then it doesn't prevent someone from (1) checking out all externals, then (2) modifying some files, then (3) creating a case (without running manage_externals again between (2) and (3)). I think this would be a much more common source of issues. IMHO, having this check in manage_externals can give a false sense of security while still causing annoyances to users.
@billsacks note that manage_externals is ALREADY being run every time you run a case, to do provenance documentation. And as you know, CTSM testing runs it before running a test suite. Right now it just documents the state with the "--status" option, so you can have dirty externals, but it does run it. If you don't think it should be there, we'd need to remove it from CIME and then come up with some different type of checker. Personally I think the way it works now makes sense, with CIME relying on manage_externals to document a case.
@ekluzek sorry if I was unclear. I was supporting the use of the "--status" option to document the state of the externals for provenance. What I was arguing against was relying on checkout_externals aborting when a modified, in-sync external is found as the mechanism that protects a production run. To summarize my feelings in a different way: I feel we should simultaneously:

- allow checkout_externals to proceed (perhaps with a warning) when an external is modified but already at the right SHA; and
- have cime itself check for a clean sandbox during case creation, build or submission, using manage_externals to determine the sandbox's status.
It seems to me like these two things together achieve the protections people want while also making manage_externals more usable... but maybe I'm missing something. |
Currently, checkout_externals refuses to do any updates if any components are modified ('M' in the second column) – even components that are in-sync. A number of people have questioned the (overly?) conservative nature of this check, and I have seen at least one instance where this check has caused trouble for CESM users (they had some cime changes in their sandbox, but wanted to use checkout_externals to bring in a new version of cism, along with the new version of cism's sub-external, source_cism).
I feel it would be reasonable to allow checkout_externals to proceed in the presence of modifications ('M' state), as long as the out-of-sync externals are unmodified. However, perhaps a warning should be issued in this case?
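To make that proposal concrete, here is a minimal sketch in Python. The names (ExternalStatus, in_sync, modified, check_externals) are illustrative assumptions, not the actual manage_externals classes or functions:

```python
import logging
from dataclasses import dataclass

@dataclass
class ExternalStatus:
    # Illustrative stand-in for whatever status object manage_externals actually uses.
    in_sync: bool    # already at the SHA requested by the externals cfg file?
    modified: bool   # local modifications present ('M' state)?

def check_externals(statuses):
    """statuses: dict mapping external name -> ExternalStatus.

    Abort only when an external is both out of sync and modified;
    warn, but proceed, when an in-sync external is modified.
    """
    for name, status in statuses.items():
        if status.modified and not status.in_sync:
            raise RuntimeError(
                "{0} is out of sync and has local modifications; "
                "refusing to update it".format(name))
        if status.modified:
            logging.warning(
                "%s has local modifications but is already in sync; "
                "leaving it untouched", name)
```

Under this rule, the sandbox described above (local cime changes, with cime presumably in sync and cism out of sync but unmodified) would be allowed to pull in the new version of cism.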
Some aspects of this that might need some careful thought, though, are related to sub-externals:
1. What if a component is in-sync but has modified its sub-externals cfg file? Do we read from the modified sub-externals cfg file, or abort, in this case?
2. What if a component, 'foo', is out-of-sync, and one of its sub-externals, 'bar', is in-sync but modified? Bringing 'foo' into sync could involve updating its version of 'bar', at which point we'd have an external ('bar') that is both modified and out-of-sync. The safe thing to do would be to abort if, for a given external that is out-of-sync, either that external itself or any of its sub-externals is modified; I'm not sure whether there is an easy way to check that condition currently.
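A minimal sketch of the "safe" condition from case 2 above, in Python. ExtNode and its fields are illustrative names only; the real manage_externals data structures are organized differently:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ExtNode:
    # Illustrative stand-in for an external and its sub-externals.
    name: str
    in_sync: bool     # at the SHA requested by its parent's cfg file?
    modified: bool    # local modifications present ('M' state)?
    children: List["ExtNode"] = field(default_factory=list)  # sub-externals

def subtree_modified(node: ExtNode) -> bool:
    """True if this external, or any of its sub-externals, has local modifications."""
    return node.modified or any(subtree_modified(child) for child in node.children)

def blocks_checkout(node: ExtNode) -> bool:
    # Only externals that need to be moved to a new SHA get the stricter,
    # recursive cleanliness check; in-sync externals never block the checkout.
    return (not node.in_sync) and subtree_modified(node)
```

In the 'foo'/'bar' example above, 'foo' would block the checkout: it is out of sync, and its sub-external 'bar' is modified.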