-
-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Classpath stability for tests #261
Comments
This should be solved by #263 @seancorfield I think. |
No. This is not fixed. Please reopen it. |
I now sort the libs (line 17) in the issue-261 branch: polylith/components/test-runner-orchestrator/src/polylith/clj/core/test_runner_orchestrator/core.clj Lines 15 to 18 in ab63192
I'm not sure if that solves your problem @seancorfield. I couldn't figure out how to let the poly tool produce the exact same order as |
That may make things worse -- but it does at least make things predictable (even if it ends up being predictably wrong). I want to keep this open until I can spend some time looking into this problem. It's not going to affect most people, with smaller projects, but this sort of thing can have a really bad impact on projects of the scale we have at work. |
I think I've also encountered this bug. In my case I get this error while running
I do have I still get the above error. Inside dev repl I can create the component that creates a hikari datasource using next jdbc component . |
Is it possible to reproduce the problem in a smaller public repo and share it with me @ieugen? |
@tengstrand : I added a repro in this PR ieugen/poly-rcf#1 . I run Big questions are:
|
Any update on this issue? We're encountering the same problem on some machines when running poly tests and at first we thought it's related to a specific dependency's problem as we've logged here: but then I discovered this issue which might explain better what we're seeing. |
I have some ideas on how to solve this, so will have a look. |
The reason we haven't spent any time on this at work (since we were the most affected and the first to be affected) is that we switched to the external test runner and, combined with a small change to our dependencies (which I no longer remember, sorry), the problem went away. I am concerned that any changes to the classpath order at this point could break someone's setup, i.e., making it worse for some people, even if it makes it better for others. Part of the problem here is that Polylith's decision to have I've continued to think about this problem on and off over the past year and I still don't have a good answer. The closest I've come is to make a fork of |
I have copied (and slightly changed) the poly-rcf example to examples/poly-rcf in the polylith repo, that also includes all the instructions on how to test the code and what changes that have been made. I also wanted to see if the code worked as a monolith, so I created a monolith directory that contains the same code and libraries. I still don't understand why the default test runner doesn't behave exactly the same on all machines. I get that it doesn't behave the same as tools.deps, but it should at least behave the same every time. So if it works on one machine, it should also work on another machine? Maybe you have some ideas? |
The main problem was that the tests didn't run in sufficient isolation, but when I switched to the external test runner it worked! I have summarized what I did here. |
Good to know that the external test runner is a potential solution here. One extra thing it has allowed us to do (and a blog post is coming on this at some point): merge our The new poly API also allows us to compute changed projects in |
I was thinking that it could be time to get rid of these isolation problems. My idea is to copy the external test runner code into the polylith repo (and make necessary modifications). What do you think @seancorfield? Then we could either make this the default test runner (in that case we need to discuss this with the community). Or we could set it in the global test config when a workspace is created. When we implement a test runner for ClojureScript (when we have support for cljs) we could make it isolated too, and everything would be consistent. @furkan3ayraktar @imrekoszo |
I'm torn. I think encouraging multiple test runner implementations is a good thing. I don't think favoring one over another is ideal. For most projects, Polylith's in-process test runner is just fine. It was fine for us for quite a while, even at the scale (complexity) we reached. Some people are going to want to use Kaocha -- and maybe they'd want the option of an in-process vs external test running (the latter is not available as I understand it?). There are also all sorts of controls folks might want to apply to their test running process: should it include Basically, there are all sorts of "knobs and dials" that test runners provide that -- currently -- I'd prefer to stay with the current overall machinery but improve the test runner "API" and make it easier for |
Good points! |
I agree with Sean's last comment here, here are my thoughts:
Yep, I have been wanting to add support for this to polylith-kaocha but I haven't got to it so far.
This is very true. There is a good blog post from back in 2018 about the various responsibilities around running tests, and how Kaocha chose to address a subset of those. I had made a related comment on another issue here in 2021 and I still agree with the main points in that:
A default, built-in test runner helps a lot with getting started with a workspace (and also by serving as an example for 3rd party runners). Whether that's an in-process or external (or there are built-in options for both) doesn't really matter IMO. BTW, following up on Sean's suggestion about perhaps forking tools.deps for classpath calculation, there's a lib for classloader management - perhaps it can be helpful: https://github.com/lambdaisland/classpath Edit: now that I look at the api of that lib, it might not be that useful after all |
Good input @imrekoszo. |
Hey @tengstrand we have since swapped out dependencies in our project (due to other reasons) and are not seeing this problem anymore with our current set of dependencies when running poly tests. |
Not yet. I am busy with some other work. I might not be able to do that in the near future. |
I created issue 448. Please take a look and see if it looks reasonable @seancorfield @ieugen. |
I've responded there but I'm not sure why you've created that issue since it doesn't appear to solve any issue raised here (or anywhere else)? |
I added the link to issue 448 because it was mentioned in the workspace that reproduced this issue. But yes, you are right Sean that it wasn't directly related to this issue. |
Describe the bug
The classpath that
poly test
calculates is not stable/reproducible across machines, leading to unpredictable behavior of tests across multiple machines.To Reproduce
I added debug output to the test orchestrator to print
all-paths
so that I could see the order being different on multiple machines. The Jetty libraries have changed their group/artifact over time but retained some of the same class names which means that a non-deterministic ordering of the classpath means that you can have tests pass on one machine but fail on another due to that ordering cause a different artifact to be found "first" on the classpath and an incompatible class used.Expected behavior
tools.deps.alpha
uses a specific sorting algorithm to ensure a reproducible classpath across all environments -- the ideal would be Polylith producing that exact same order (this is tricky becausepoly test
has to construct a projection of the:test
alias across all the bricks in order to construct the "basis"). Not so ideal but acceptable would at least be for the order of the classpath produced bypoly test
to be deterministic. That means not using sets or hash maps to store elements of the classpath.The text was updated successfully, but these errors were encountered: