Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite the link fixer script #2

Merged
merged 2 commits into from
May 28, 2024
Merged

Rewrite the link fixer script #2

merged 2 commits into from
May 28, 2024

Conversation

Wuestengecko
Copy link
Member

This commit greatly refactors the link fixer script. The previous solution was based on detecting links using regular expressions and applying some pattern-based fixes. However, in practice, this approach proved to be a bad one. Luckily it at least didn't make things worse (the model was already broken before anyways), but often enough it didn't produce the desired outcome of a no-longer-broken model.

The new implementation introduced in this commit loads the model using capellambse's MelodyLoader facility, and applies fixes using the semantics-aware methods that the loader offers. Specifically, it tries to parse all XML attributes as a list of element links, discards the (potentially broken) file path from each link, and finally regenerates them using MelodyLoader.create_link.

BREAKING CHANGE: Changed the hook's ID to fix-capella-fragment-links. Experimentation has unfortunately shown the new implementation to be noticably slower than the old one - to remedy this, the hook now skips itself during the follow-up commit, so as to not run twice. The ID change was made to more reliably identify the hook, and reduce the chance for name clashes with a different hook that might fix some other type of link in some other type of file.

@Wuestengecko Wuestengecko marked this pull request as draft November 6, 2023 13:41
@Wuestengecko
Copy link
Member Author

I noticed that I missed the special handling of <semanticResources> text. I'll add that back.

This commit greatly refactors the link fixer script. The previous
solution was based on detecting links using regular expressions and
applying some pattern-based fixes. However, in practice, this approach
proved to be a bad one. Luckily it at least didn't make things worse
(the model was already broken before anyways), but often enough it
didn't produce the desired outcome of a no-longer-broken model.

The new implementation introduced in this commit loads the model using
capellambse's MelodyLoader facility, and applies fixes using the
semantics-aware methods that the loader offers. Specifically, it tries
to parse all XML attributes as a list of element links, discards the
(potentially broken) file path from each link, and finally regenerates
them using `MelodyLoader.create_link`.

BREAKING CHANGE: Changed the hook's ID to `fix-capella-fragment-links`.
Experimentation has unfortunately shown the new implementation to be
noticably slower than the old one - to remedy this, the hook now skips
itself during the follow-up commit, so as to not run twice. The ID
change was made to more reliably identify the hook, and reduce the
chance for name clashes with a different hook that might fix some other
type of link in some other type of file.
@Wuestengecko Wuestengecko marked this pull request as ready for review November 6, 2023 14:37
Copy link
Collaborator

@micha91 micha91 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looking good. As I don't know capellambse that good, I cannot not say whether this covers all necessary links. I would recommend to collect error messages and if there are any, we could let the hook fail after committing the changes. This way links would be fixed on best effort basis and the user would be informed that he has to get in touch with technical support. In addition, we could also run the previously used regex for validation purposes at the end of the hook. This way we could figure out, whether there are any broken links after the model was fixed by this script

.rstrip("\0")
.split("\0")
)
if dirty_files != [""]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we won't be able to split changes in different commits anymore? E.g. I changed something in the SA layer and in the OA layer and want to commit it separately - in this case the script will fail. I know the struggle here and in a perfect world the user would have had committed his changes from the one layer before making changes in another one, but we should at least make people aware of that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pre-commit should stash away all not-to-be-committed changes before it runs any hooks, so it should actually work the same in that regard. Although I'm not 100% sure if it also does that for "post" hooks - maybe something to check before merging this.

targetfrag = element.text.rsplit("/", 1)[-1]
frags = [i for i in model.trees if i.name == targetfrag]
if len(frags) != 1:
raise RuntimeError(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we execute this as a post-commit-hook, the hook would in this case fail, but the users commit (which may contains several broken links) would still be present and could be pushed. Imo it would be a better to fix the rest of the model on best effort and report errors at the end of the hook

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So do you propose to just not touch links with ambiguous target at all (but still fix all others)? I suppose that would at least fix some of the issues, but the model would still be in a broken state anyways.

In any case, definitely something the user should be aware of. I'm just not sure how we could make the user aware, as Capella generally hides hook output completely (even if it fails). Maybe we could try to show a message box with zenity or something similar if that's installed?

@@ -1,7 +1,7 @@
- id: fix-links
- id: fix-capella-fragment-links
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this repository is for capella-git-hooks, I don't think it is necessary to include capella in the hook ID. I get the point of adding fragment to the ID, but I am not sure whether this justifies the additional work it takes to change the hook in the pre-commit config of the projects...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this scope, yes, but the hook ID is also used in the (pre-commit wide) SKIP environment variable. That's why I included "capella" there. See the commit command that this script executes (translated to bash):

SKIP=fix-capella-fragment-links git commit -m "Fix links yada yada"

try:
target = model.follow_link(None, target_id)
except KeyError:
LOGGER.error("Cannot repair dangling link: #%s", target_id)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The user will never find this error message. Maybe we should think about collecting error messages and let the hook fail after repairing the model on best effort basis?

@Wuestengecko
Copy link
Member Author

Wuestengecko commented Nov 21, 2023

I would recommend to collect error messages and if there are any, we could let the hook fail after committing the changes. This way links would be fixed on best effort basis and the user would be informed that he has to get in touch with technical support.

Problem is that (as I already mentioned) failures in post-hooks are non-critical, and Git won't even inform Capella about it (so we can't even blame Capella this time ;) ):

post-commit
This hook is meant primarily for notification, and cannot affect the outcome of git commit.

(From githooks(5))

Try this:

$ cat .git/hooks/post-commit
#!/bin/bash -e
echo 💣
exit 1
$ git commit --allow-empty -m "Test"
💣
[master 15e0282] Test
$ echo $?
0

So we need some other way of communicating to the user. Maybe we could use something like Zenity to show simple message boxes?

In addition, we could also run the previously used regex for validation purposes at the end of the hook. This way we could figure out, whether there are any broken links after the model was fixed by this script

One major reason why I started this rewrite was that the regexp turned out to not even find half of the broken links in one case I saw. So while it might be a good idea just to double-check that we didn't miss anything obvious, we also need to be aware that these regexps aren't fail-safe either. :)

... And then again that begs the question of how we get this information to the user.

Time will tell us if we're really incompatible or we were just being
overly pessimistic.
@Wuestengecko Wuestengecko merged commit 2f0d337 into main May 28, 2024
6 checks passed
@Wuestengecko Wuestengecko deleted the refactor-fix-links branch May 28, 2024 16:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants