Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Edge stability bot is potentially unstable? #5947

Closed
gsnedders opened this issue May 16, 2017 · 26 comments
Closed

Edge stability bot is potentially unstable? #5947

gsnedders opened this issue May 16, 2017 · 26 comments

Comments

@gsnedders
Copy link
Member

cc/ @jgraham @bobholt @domenic

#5944 and #5942 have both hit the Edge bot giving somewhat odd unstable results (and stable in every other browser); this and this are the Travis jobs for those.

#5944 I think might be random in Edge down to it trying to open third-party applications (scheme-dependent) and popups (from window.open, which we probably don't allow over Sauce?), and then racing as to whether one of those blocks the test harness?

#5942 I suspect is just a bug in Edge?

@gsnedders
Copy link
Member Author

There's an ignore-edge branch ready to go on my fork if we want to land something to allow failures on the Edge stability checker; not creating a PR lest someone blindly merge it immediately.

@gsnedders
Copy link
Member Author

My current guess for #5942 is that it's some JIT bug in Edge around HTMLAllCollection.

@jgraham
Copy link
Contributor

jgraham commented May 16, 2017

Pretty sure we must be allowing window.open because otherwise the harness wouldn't work.

It might be possible to get video out of sauce to investigate these issues. @plh has access, and we might need to enable an option somewhere to cause it to be recorded.

@jgraham
Copy link
Contributor

jgraham commented May 17, 2017

https://wiki.saucelabs.com/display/DOCS/Viewing+Screenshots%2C+Commands%2C+Logs%2C+and+Metadata+for+Test+Results is the documentatiaon about viewing the video.

I think we should make Edge non-blocking for now.

@annevk
Copy link
Member

annevk commented May 17, 2017

Can we do this? #5960 is also affected.

@bobholt
Copy link
Contributor

bobholt commented May 17, 2017

#5942 definitely looks like an Edge bug. #5944 looks like some sort of error or timeout on the sauce side.

window.open should work. We have to run scripts before the tests on every Sauce run to disable the popup blockers. That was the hardest thing about getting the tests to run on Sauce - tracking down the registry key to disable Edge's popup blocker.

I don't believe there's an option to allow the screenshots/video - I think it's just there by default (I remember seeing it during my testing).

I'm 👍 on adding both Edge and Safari to allow failures as long as the instability is still addressed.

@foolip
Copy link
Member

foolip commented May 17, 2017

It's probably best to make them non-blocking until we have some history to see the rate of failures. However, we will occasionally have flakiness due to bugs in all browsers, and currently the only workaround is a manual merge. Should we have the equivalent of a "I confirm that this flakiness isn't my fault" button (a label?) to allow merging with flaky results in general?

@bobholt
Copy link
Contributor

bobholt commented May 17, 2017

Super-admins (or whatever the GitHub permission is) like @jgraham can merge with failures. I believe in the past, iit's been a matter of escalating a PR to one of them. I haven't had to deal with that, but watching them go by on IRC, it has appeared to work. However, there may be a better process that could be put into place.

@annevk
Copy link
Member

annevk commented May 17, 2017

FWIW, that process is not ideal (often comments on IRC end up not getting addressed) and requiring everyone to rebase because we misconfigured a new tool isn't great either.

@foolip
Copy link
Member

foolip commented May 17, 2017

I have the permission bit in question and have been taking care of Chromium exports, but I also don't think it's ideal. If it only happens once a month it'd be OK, but if wpt contributions increase that seems unlikely.

@gsnedders
Copy link
Member Author

#5960 is the same test as #5942 being flaky, so it's not additional flakiness per-se. OTOH, that's a file likely to get modified relatively often…

@gsnedders
Copy link
Member Author

#5965 has made it ignored.

@bobholt I still wonder if #5944 is actually something related to the fact it opens up prompts to open other programs and that somehow blocking it (if the window needs focus or something for WebDriver to work?).

@jgraham
Copy link
Contributor

jgraham commented May 17, 2017

So I think there are two issues here:

  • Edge is flaky. We should make it not block PRs and get access to the video from sauce to try to understand what's wrong there.

  • Sometimes PRs are blocked by flakiness. Whilst I understand that's a pain, I think having a simple way to say "I don't care about flakiness" is pretty open for abuse. I guess I would be happier with this idea if the procedure was "provide a link to a bug report on the browser describing the flakiness", so that people couldn't just mark random tests as "flaky" on browsers they they don't care about (e.g. gecko devs ignoring Safari flakiness) without making any serious attempt to investigate whether it's avoidable or a test issue, or whatever.

@jugglinmike
Copy link
Contributor

I guess I would be happier with this idea if the procedure was "provide a
link to a bug report on the browser describing the flakiness", so that
people couldn't just mark random tests as "flaky" on browsers they they
don't care about (e.g. gecko devs ignoring Safari flakiness) without making
any serious attempt to investigate whether it's avoidable or a test issue,
or whatever.

I was going to recommend gating on validation from a quorum of the available
browsers (rather than all of them), but this is much more constructive. We
could support this via GitHub comments, but if we instead required the bug
report reference in a commit message, then we'd also be increasing the value of
the project's git history. The workflow would entail pushing empty commits with
bug references, and it would require reviewers to be more careful about how they
squash commits prior to landing. Would that be more trouble than it's worth?

@gsnedders
Copy link
Member Author

#5942 as far as I can tell was an Edge bug fixed in 15, but Sauce Labs only has 14. :(

@gsnedders
Copy link
Member Author

Given that's a fixed bug in an outdated version, it doesn't seem to make sense to require a bug report for that?

@foolip
Copy link
Member

foolip commented May 17, 2017

I could repro that in Edge 15 and filed https://developer.microsoft.com/en-us/microsoft-edge/platform/issues/12047069/

@gsnedders
Copy link
Member Author

#5288 has https://travis-ci.org/w3c/web-platform-tests/jobs/233024736 which just timed out despite running a single test

@zcorpan
Copy link
Member

zcorpan commented May 19, 2017

Got "OK: 9/10, TIMEOUT: 1/10" in #5991 (comment)

@bobholt
Copy link
Contributor

bobholt commented May 19, 2017

As mentioned in IRC, it's possible that MISSING/TIMEOUT tests are due to the Sauce account limit on concurrent jobs. @plehegar can you check on that or somehow share access with me so I can do some digging?

@bobholt
Copy link
Contributor

bobholt commented Oct 4, 2017

Edge is still an issue. We need to fix it in Sauce, see if Browserstack offers a more reliable solution, or set up our own Windows VM for the CI flakiness.

@plehegar
Copy link
Member

plehegar commented Oct 4, 2017

@bobholt sorry, didn't see your question before. do you still need access to the sauce account?

@bobholt
Copy link
Contributor

bobholt commented Oct 4, 2017

@plehegar I believe @lukebjerring is going to be taking on some of my infra responsibilities after next week. Once he's ramped up, I'll ask him to reach out to you to figure this out.

@gsnedders
Copy link
Member Author

#6327 is part of this when it comes to reftests (and that's blocked on MS).

@foolip
Copy link
Member

foolip commented Jan 24, 2018

@mattl, do we have enough data in https://pulls.web-platform-tests.org/ to be able to work out how often there's a problem with any kind of run? At some point we want flakiness in all browsers to not go unnoticed, but it's scary to make failures blocking in Travis without knowing how often it happens.

Given that we actually don't have any work planned for this soon, I think it's more honest to call this a backlog issue. If people keep noticing, or we have some stats about it, then we should raise the priority again.

@foolip
Copy link
Member

foolip commented Apr 4, 2018

Sauce is gone now, as of #9903, and with it Edge stability runs.

@foolip foolip closed this as completed Apr 4, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants