-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Kececioglu matcher #9
base: master
Are you sure you want to change the base?
Conversation
stack to constructor and renamed variables for readability
MaximumMatching tests
Need some time to review but looks very interesting, thanks for contributing. The minimal speed gain is likely because of the approximate matching first and this covers almost all cases. In fact in Beam v2 (which will be out at some point) I just use an exhaustive search to clean up the Kekulization/Localisation of bonds since often the greedy matching will find a solution with only two unmatched nodes - in which case the task is just to see if there is a path between them. From my thesis: |
It's not obvious from the links I provided but I was essentially basing this implementation on the prior C implementation described in the paper which was provided by the author here: https://www2.cs.arizona.edu/~kece/Research/code/matching1.sh.Z |
Sorry been busy so only just looked at this, could you add a copyright/license header to the new files. Also I would add a comment pointing to the C implementation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry saw the C link was in there, needs a capital Z though.
@@ -39,22 +39,22 @@ | |||
* "Computing maximum-cardinality matchings in sparse general graphs." | |||
* Proceedings of the 2nd Workshop on Algorithm Engineering, 121-132, 1998. | |||
* and from their C implementation at | |||
* https://www2.cs.arizona.edu/~kece/Research/code/matching1.sh.Z | |||
* https://www2.cs.arizona.edu/~kece/Research/code/matching1.sh.z |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
capital Z needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Originally I had intended to write the algorithm from scratch based on the description in the paper, however I ended up using the C code as a guide and followed it pretty closely, thus I think the licensing of the original code should apply. Unfortunately, I don't know the original license, so I suppose I'll need to determine that and its compatibility with Beam's license before this can be pulled in. I guess I got my sorting of horses and carts reversed.
KececiogluMatching is a depth-first implementation of Edmond's maximum matching algorithm with early-termination test heuristic as described in
"Computing maximum-cardinality matchings in sparse general graphs."
Proceedings of the 2nd Workshop on Algorithm Engineering, 121-132, 1998.
https://www2.cs.arizona.edu/~kece/Research/abstracts.html#KP98 and
https://www2.cs.arizona.edu/~kece/Research/papers/KP98.ps
The hope was that this algorithm might improve the speed of matching for chemical graphs despite their low degree. Thus it was designed to act as a drop in replacement for the MaximumMatching class. However, on my system the algorithm performs nearly identically in terms of execution speed as the MaximumMatcher adapted from Eppstein when run against the 5616 ChEMBL smiles that require the MaximumMatching algorithm. Of the 5616 smiles in ChEMBL requiring the algorithm, 256 result in a different matching when using KececiogluMatching and manual inspection of a random subset of those indicates that they are valid resonance forms. Given the lack of performance gain, I did not incorporate the KececiogluMatching into Localise, but the code is written and could be useful to someone and maybe there is performance yet to squeeze from it.
I created new tests that mimic MaximumMatchingTest and all pass.