Rewrite the link fixer script #2

Wuestengecko · 2023-11-06T13:36:13Z

This commit greatly refactors the link fixer script. The previous solution was based on detecting links using regular expressions and applying some pattern-based fixes. However, in practice, this approach proved to be a bad one. Luckily it at least didn't make things worse (the model was already broken before anyways), but often enough it didn't produce the desired outcome of a no-longer-broken model.

The new implementation introduced in this commit loads the model using capellambse's MelodyLoader facility, and applies fixes using the semantics-aware methods that the loader offers. Specifically, it tries to parse all XML attributes as a list of element links, discards the (potentially broken) file path from each link, and finally regenerates them using MelodyLoader.create_link.

BREAKING CHANGE: Changed the hook's ID to fix-capella-fragment-links. Experimentation has unfortunately shown the new implementation to be noticably slower than the old one - to remedy this, the hook now skips itself during the follow-up commit, so as to not run twice. The ID change was made to more reliably identify the hook, and reduce the chance for name clashes with a different hook that might fix some other type of link in some other type of file.

Wuestengecko · 2023-11-06T13:42:40Z

I noticed that I missed the special handling of <semanticResources> text. I'll add that back.

This commit greatly refactors the link fixer script. The previous solution was based on detecting links using regular expressions and applying some pattern-based fixes. However, in practice, this approach proved to be a bad one. Luckily it at least didn't make things worse (the model was already broken before anyways), but often enough it didn't produce the desired outcome of a no-longer-broken model. The new implementation introduced in this commit loads the model using capellambse's MelodyLoader facility, and applies fixes using the semantics-aware methods that the loader offers. Specifically, it tries to parse all XML attributes as a list of element links, discards the (potentially broken) file path from each link, and finally regenerates them using `MelodyLoader.create_link`. BREAKING CHANGE: Changed the hook's ID to `fix-capella-fragment-links`. Experimentation has unfortunately shown the new implementation to be noticably slower than the old one - to remedy this, the hook now skips itself during the follow-up commit, so as to not run twice. The ID change was made to more reliably identify the hook, and reduce the chance for name clashes with a different hook that might fix some other type of link in some other type of file.

micha91

Overall looking good. As I don't know capellambse that good, I cannot not say whether this covers all necessary links. I would recommend to collect error messages and if there are any, we could let the hook fail after committing the changes. This way links would be fixed on best effort basis and the user would be informed that he has to get in touch with technical support. In addition, we could also run the previously used regex for validation purposes at the end of the hook. This way we could figure out, whether there are any broken links after the model was fixed by this script

micha91 · 2023-11-13T10:27:32Z

capella_git_hooks/fix_links.py

+        .rstrip("\0")
+        .split("\0")
+    )
+    if dirty_files != [""]:


So we won't be able to split changes in different commits anymore? E.g. I changed something in the SA layer and in the OA layer and want to commit it separately - in this case the script will fail. I know the struggle here and in a perfect world the user would have had committed his changes from the one layer before making changes in another one, but we should at least make people aware of that.

pre-commit should stash away all not-to-be-committed changes before it runs any hooks, so it should actually work the same in that regard. Although I'm not 100% sure if it also does that for "post" hooks - maybe something to check before merging this.

micha91 · 2023-11-14T07:09:41Z

capella_git_hooks/fix_links.py

+            targetfrag = element.text.rsplit("/", 1)[-1]
+            frags = [i for i in model.trees if i.name == targetfrag]
+            if len(frags) != 1:
+                raise RuntimeError(


As we execute this as a post-commit-hook, the hook would in this case fail, but the users commit (which may contains several broken links) would still be present and could be pushed. Imo it would be a better to fix the rest of the model on best effort and report errors at the end of the hook

So do you propose to just not touch links with ambiguous target at all (but still fix all others)? I suppose that would at least fix some of the issues, but the model would still be in a broken state anyways.

In any case, definitely something the user should be aware of. I'm just not sure how we could make the user aware, as Capella generally hides hook output completely (even if it fails). Maybe we could try to show a message box with zenity or something similar if that's installed?

micha91 · 2023-11-14T07:17:23Z

.pre-commit-hooks.yaml

@@ -1,7 +1,7 @@
-   id: fix-links
+-   id: fix-capella-fragment-links


As this repository is for capella-git-hooks, I don't think it is necessary to include capella in the hook ID. I get the point of adding fragment to the ID, but I am not sure whether this justifies the additional work it takes to change the hook in the pre-commit config of the projects...

Looking at this scope, yes, but the hook ID is also used in the (pre-commit wide) SKIP environment variable. That's why I included "capella" there. See the commit command that this script executes (translated to bash):

SKIP=fix-capella-fragment-links git commit -m "Fix links yada yada"

micha91 · 2023-11-14T07:19:32Z

capella_git_hooks/fix_links.py

+                try:
+                    target = model.follow_link(None, target_id)
+                except KeyError:
+                    LOGGER.error("Cannot repair dangling link: #%s", target_id)


The user will never find this error message. Maybe we should think about collecting error messages and let the hook fail after repairing the model on best effort basis?

Wuestengecko · 2023-11-21T09:52:37Z

I would recommend to collect error messages and if there are any, we could let the hook fail after committing the changes. This way links would be fixed on best effort basis and the user would be informed that he has to get in touch with technical support.

Problem is that (as I already mentioned) failures in post-hooks are non-critical, and Git won't even inform Capella about it (so we can't even blame Capella this time ;) ):

post-commit
This hook is meant primarily for notification, and cannot affect the outcome of git commit.

(From githooks(5))

Try this:

$ cat .git/hooks/post-commit
#!/bin/bash -e
echo 💣
exit 1
$ git commit --allow-empty -m "Test"
💣
[master 15e0282] Test
$ echo $?
0

So we need some other way of communicating to the user. Maybe we could use something like Zenity to show simple message boxes?

In addition, we could also run the previously used regex for validation purposes at the end of the hook. This way we could figure out, whether there are any broken links after the model was fixed by this script

One major reason why I started this rewrite was that the regexp turned out to not even find half of the broken links in one case I saw. So while it might be a good idea just to double-check that we didn't miss anything obvious, we also need to be aware that these regexps aren't fail-safe either. :)

... And then again that begs the question of how we get this information to the user.

Time will tell us if we're really incompatible or we were just being overly pessimistic.

Wuestengecko marked this pull request as draft November 6, 2023 13:41

Wuestengecko force-pushed the refactor-fix-links branch from 2e3519a to 91ee53d Compare November 6, 2023 14:36

Wuestengecko marked this pull request as ready for review November 6, 2023 14:37

micha91 reviewed Nov 14, 2023

View reviewed changes

build: Don't constrain upper Python version

2f0d337

Time will tell us if we're really incompatible or we were just being overly pessimistic.

Wuestengecko merged commit 2f0d337 into main May 28, 2024
6 checks passed

Wuestengecko deleted the refactor-fix-links branch May 28, 2024 16:19

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Rewrite the link fixer script #2

Rewrite the link fixer script #2

Wuestengecko commented Nov 6, 2023

Wuestengecko commented Nov 6, 2023

micha91 left a comment

micha91 Nov 13, 2023

Wuestengecko Nov 21, 2023

micha91 Nov 14, 2023

Wuestengecko Nov 21, 2023

micha91 Nov 14, 2023

Wuestengecko Nov 21, 2023

micha91 Nov 14, 2023

Wuestengecko commented Nov 21, 2023 •

edited

Loading

		@@ -1,7 +1,7 @@
		- id: fix-links
		- id: fix-capella-fragment-links

Rewrite the link fixer script #2

Rewrite the link fixer script #2

Conversation

Wuestengecko commented Nov 6, 2023

Wuestengecko commented Nov 6, 2023

micha91 left a comment

Choose a reason for hiding this comment

micha91 Nov 13, 2023

Choose a reason for hiding this comment

Wuestengecko Nov 21, 2023

Choose a reason for hiding this comment

micha91 Nov 14, 2023

Choose a reason for hiding this comment

Wuestengecko Nov 21, 2023

Choose a reason for hiding this comment

micha91 Nov 14, 2023

Choose a reason for hiding this comment

Wuestengecko Nov 21, 2023

Choose a reason for hiding this comment

micha91 Nov 14, 2023

Choose a reason for hiding this comment

Wuestengecko commented Nov 21, 2023 • edited Loading

Wuestengecko commented Nov 21, 2023 •

edited

Loading