Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add benchmark project and action to alert performance issues #209

Closed
wants to merge 34 commits into from

Conversation

OptimumCode
Copy link
Collaborator

@OptimumCode OptimumCode commented Jun 29, 2024

Hi, this is the initial version of the PR. @aSemy could you please take a look?

I faced some issues when adding the benchmark:

  1. For JS target I got an error during benchmark generation:
    KLIB resolver: Could not find "net.thauvin.erik.urlencoder:urlencoder-lib" in [.../snakeyaml-engine-kmp]
    
    I have not figured out the exact reason, but I added this library explicitly into jsMain dependencies and it started working. Maybe anyone has seen something like that before...
  2. The GitHub action works only with a single result file. Because of that, it is a bit tricky to upload results for other platforms (especially because benchmarks for windows and macos must be executed on corresponding runners).
    I think that we could create a reusable workflow that executes all benchmarks available for a particular runner (ubuntu, windows, macos), collects the results (using artifacts or github cache) and uploads all the results in a separate job using matrix build (which contains all the supported targets for that runner). But I don't know how this action will work if those workflows try to commit changes in parallel. I think nothing good will happen... Maybe anyone has other ideas (create a PR to the action repo with multi-file support?).

Here is what the generated page with benchmark results looks like:
https://optimumcode.github.io/snakeyaml-engine-kmp/dev/bench/

To make it work you need to do some preparations first (to allow action publish the result to GitHub pages).

Branch with benchmark results:
https://github.com/OptimumCode/snakeyaml-engine-kmp/tree/gh-pages/dev/bench

Here is an example of PR that introduces a performance issue:
OptimumCode#1

Also, every workflow run generates a summary where you can see how the benchmark results have changed:
https://github.com/OptimumCode/snakeyaml-engine-kmp/actions/runs/9725885272?pr=1

Resolves #208

@OptimumCode
Copy link
Collaborator Author

OptimumCode commented Jun 29, 2024

And I think this PR should be merged after #207 (I will rebase this branch once it is merged)

@OptimumCode
Copy link
Collaborator Author

There is a suggestion on how multiple benchmark results can be combined and processed together.

@aSemy aSemy self-requested a review June 30, 2024 06:46
@OptimumCode
Copy link
Collaborator Author

I have managed to join results from multiple runs in a single file and uploaded them using benchmark-action. This is workflow step to do that (and the result). Could the same approach be used here? What do you think?

@aSemy
Copy link
Collaborator

aSemy commented Jul 1, 2024

Thanks for this! I will take a look.

Comment on lines 37 to 39
// TODO: there is a Load class in JVM sources that can handle all of it
// but it is not available for common code.
// Probably, it should be moved from JVM sources to common sources.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, I'll try to commonize Load.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


permissions:
# deployments permission to deploy GitHub pages website
deployments: write
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a thought: maybe writing to GitHub pages should only be enabled when running on the main branch? If I understand the benchmark action, then the benchmarks should only be updated when running on main, but I wonder if an additional guard would help.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot find how the permissions can be dynamically modified (github context is not available in that scope). I can split this workflow into two with separate sets of permissions and a reusable workflow to reduce duplication. It should resolve concerns about writing something to gh-pages in PRs

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But still, it would be possible to modify the PR workflow and push results from PR (it just will be more obvious because the permissions will be modified)

settings.gradle.kts Outdated Show resolved Hide resolved
Comment on lines 61 to 64
implementation("net.thauvin.erik.urlencoder:urlencoder-lib:1.5.0") {
because("error during benchmark generation: e:\n" +
"KLIB resolver: Could not find \"net.thauvin.erik.urlencoder:urlencoder-lib\" in [.../snakeyaml-engine-kmp]")
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's an unexpected error! I wonder why it doesn't happen on other targets. I'll take a look...

Maybe snake-kmp needs to mark it as api()?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might be related to how the benchmark plugin resolves dependencies when compiling generated benchmarks. But I believe snake-kmp should not mark this dependency as api because it does not expose this library in public API (even if it fixes this problem for some reason)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are actually right... if you declare the dependency to that library as api, the benchmark for JS compiles without any issues. But it is not right to include a library as api if it is not used in public API. I will raise a question in kotlinx-benchmark repo - maybe they can help with that

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is already an issue for that: Kotlin/kotlinx-benchmark#185
Re-declaration of dependencies.

settings.gradle.kts Outdated Show resolved Hide resolved
@OptimumCode
Copy link
Collaborator Author

Hi, @aSemy. I want to rebase the branch on the latest main state. Please, let me know if you are doing some changes - I will do a merge instead of rebase in this case (but I would prefer a liner history)

@aSemy
Copy link
Collaborator

aSemy commented Jul 1, 2024

I'm not working on this branc, go ahead and rebase if you'd like. We'll squash merge the branch into main, so I wouldn't worry too much about rebasing vs merging.

@OptimumCode
Copy link
Collaborator Author

Hi, @aSemy
I merged the results for all platforms and they are all available on one page (https://optimumcode.github.io/snakeyaml-engine-kmp/dev/bench/). Does it look okay? Could you please check when you have time?

@OptimumCode
Copy link
Collaborator Author

Hi, @krzema12. Could you please also take a look at the PR?

The example can be found in the forked repo: OptimumCode#1
https://optimumcode.github.io/snakeyaml-engine-kmp/dev/bench/

@krzema12
Copy link
Owner

@OptimumCode thanks for this, it looks good in general, but we need to sort out one problem. From what I see, the new workflow publishes the report via GitHub Pages, but we already use it to publish API reference: https://github.com/krzema12/snakeyaml-engine-kmp/blob/main/.github/workflows/publish-site.yml.

One idea is to deploy both the docs and the performance report using GH Pages, but under different paths. Then we'd also need to ensure that both workflows don't override each other's results.

I'm gonna sleep on this problem, maybe a better solution will appear, but I'm also open to your ideas!

@OptimumCode
Copy link
Collaborator Author

Hi, @krzema12. Thank you for pointing out that you already use the github pages in another workflow. I looked into documentation for actions/upload-pages-artifact, actions/deploy-pages and benchmark-action/github-action-benchmark actions - seems like they won't work together without some manual actions...
actions/deploy-pages expects a single artifact with all content for github-pages that fully overrides the previously deployed data. This would mean that results from benchmark-action/github-action-benchmark will override the previously deployed documentation.

I have one thought here: benchmark-action/github-action-benchmark action supports publishing results into another repository. What do you think about creating a snakeyaml-engine-kmp-benchmarks (or any other name you prefer) repository and publishing results into this repo?

@krzema12
Copy link
Owner

I'm fine with the separate repo! I'll let you know once created. Do I need to set up any extra perms anywhere?

@OptimumCode
Copy link
Collaborator Author

You will need to create a personal access token with deployment and contents write permissions for this new repo and add it to the secrets here

@krzema12
Copy link
Owner

@OptimumCode done! The secret is called PUBLISH_BENCHMARK_RESULTS and will expire in a year (it's impossible to set no expiration date). The dedicated repo is here: https://github.com/krzema12/snakeyaml-engine-kmp-benchmarks.

@OptimumCode
Copy link
Collaborator Author

Thank you, @krzema12. I will make necessary changes and let you know once it is done

@OptimumCode
Copy link
Collaborator Author

Hi, @krzema12. Could you please advise which branch have you configured for GitHub pages in this new repo? If you haven't could you please create a branch (gh-pages is a default value expected by the action) and configure GitHub pages to serve from this branch?
I completely forgot about this when you asked for what is needed.

@krzema12
Copy link
Owner

@OptimumCode I've just configured gh-pages branch.

@OptimumCode
Copy link
Collaborator Author

Hi, @krzema12. Could you please check if the name of the secret is correct? Either the name is not correct or the secret is not visible in pr workflow...

One more thing, could you please add pull_request write permission to this token for main repo (not for repo with benchmark results)? This is needed to generate an alert comment

@krzema12
Copy link
Owner

Hi @OptimumCode!

Could you please check if the name of the secret is correct? Either the name is not correct or the secret is not visible in pr workflow...

Yeah, the name is correct, checked by copy-pasting between GitHub's UI and workflow's definition:

Screenshot 2024-07-26 at 09 03 58

What I can do is regenerate the token and update the secret, to make sure I correctly pasted it. I've just done it - let's try now!

One more thing, could you please add pull_request write permission to this token for main repo (not for repo with benchmark results)? This is needed to generate an alert comment

I think you mean ticking the "Allow GitHub Actions to create and approve pull requests" checkbox, is it correct? If yes, I've just done it!

@OptimumCode
Copy link
Collaborator Author

I think you mean ticking the "Allow GitHub Actions to create and approve pull requests" checkbox, is it correct? If yes, I've just done it!

I am not sure this will do the trick. I was talking about token permissions:

image

@krzema12
Copy link
Owner

@OptimumCode what you show on the screenshot looks like configuring a personal access token's perms - I see this when configuring the PAT for publishing the report to the other repo.

For the GitHub token, I'm only aware of the setting I mentioned, and setting fine-grained perms via YAML, see https://github.blog/changelog/2021-04-20-github-actions-control-permissions-for-github_token/

I'll try digging deeper into why things don't work as expected here.

@OptimumCode
Copy link
Collaborator Author

OptimumCode commented Jul 26, 2024

Yes, it is PAT token configuration. We use it instead of GitHub token in benchmark action and in order to generate alert comments this token should have permissions for that (I will remove permissions configuration from workflow itself as we use the PAT token)

I'll try digging deeper into why things don't work as expected here.

Please let me know if I can help here anyhow (right now I have a limited access to the internet but in few hours I will be fully available)

@krzema12
Copy link
Owner

@OptimumCode ok, got it now! I somehow thought that the GH token is also used, but now I see that we switched to PAT to be able to write to another repo. I've adjusted the PAT's perms, let's see now.

@OptimumCode
Copy link
Collaborator Author

Superseded by #215

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add benchmarks to catch performance degradation on earlier stages
3 participants