
Issue 804985

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: ----


unclear that findit is doing something

Project Member Reported by jochen@chromium.org, Jan 23 2018

Issue description

Page URL: https://findit-for-me.appspot.com/waterfall/flake?redirect=1&key=ag9zfmZpbmRpdC1mb3ItbWVyjwELEhdNYXN0ZXJGbGFrZUFuYWx5c2lzUm9vdCJZY2hyb21pdW0ud2luL1dpbjcgVGVzdHMgKGRiZykoMSkvNjIxNTkvYnJvd3Nlcl90ZXN0cy9Rbkp2ZDNObGNsUmxjM1F1VjJsdVpHOTNUM0JsYmtOc2IzTmwMCxITTWFzdGVyRmxha2VBbmFseXNpcxgBDA

 Description:

After entering the information about the flake I'm interested in, I get a somewhat coarse graph, and nothing happens.

How can I know that the tool is doing something?
 

Comment 1 by st...@chromium.org, Jan 23 2018

Owner: wylieb@chromium.org
Status: Assigned (was: Unconfirmed)
Assigned to Brandon for investigation and further follow-up.

Comment 2 by wylieb@chromium.org, Jan 23 2018

Hey jochen@, in the top-right there's a status bar for the analysis. I circled what mine looks like here:

https://screenshot.googleplex.com/7Q0Mnc3OfKu.png

I can see why it's hard to tell that the analysis is running, though. It's tough because the analyses take 2ish hours, and we don't know exactly when they'll finish. This sort of rules out a progress bar. Any suggestions on how we can improve would be greatly appreciated.


Comment 3 by wylieb@chromium.org, Jan 23 2018

Apologies. Looks like this is possibly related to a deadloop case I'm looking into right now.

Comment 4 by wylieb@chromium.org, Jan 23 2018

Looks like it's stuck on build 62138.

Comment 5 by wylieb@chromium.org, Jan 23 2018

The root cause of this was fixed: https://chromium-review.googlesource.com/c/infra/infra/+/615095

I think we should purge old app engine instances, and this should take care of it.

Comment 6 by st...@chromium.org, Jan 24 2018

Re #3, this specific analysis was not running into the dead loop. Instead, the analysis was completed (maybe bailed out for some reason), but the status was not updated accordingly.

Comment 7 by jochen@chromium.org, Jan 24 2018

It's still showing "running" for me - but if it bailed out, should I start a new run?

It could also show me a link to the current swarming jobs, so I can at least check myself that the build is progressing?

Comment 8 by wylieb@chromium.org, Jan 24 2018

Blockedon: 805529

Comment 9 by wylieb@chromium.org, Jan 31 2018

There's some stuff that we can do, for sure. I opened an issue we can investigate moving forward. In the meantime, I did a rerun of the analysis you were looking at.

Rerun: https://findit-for-me.appspot.com/waterfall/flake?key=ag9zfmZpbmRpdC1mb3ItbWVyjwELEhdNYXN0ZXJGbGFrZUFuYWx5c2lzUm9vdCJZY2hyb21pdW0ud2luL1dpbjcgVGVzdHMgKGRiZykoMSkvNjIxNTkvYnJvd3Nlcl90ZXN0cy9Rbkp2ZDNObGNsUmxjM1F1VjJsdVpHOTNUM0JsYmtOc2IzTmwMCxITTWFzdGVyRmxha2VBbmFseXNpcxgCDA
Thanks for the rerun. The results, however, don't look like Findit actually found a culprit :/
Apologies that the results weren't helpful. Findit did identify a regression range: 

https://chromium.googlesource.com/chromium/src/+log/b4c871c98d03bf300220b1d6f27bf868eac718fe..64cae83473fce17e2d013ba1ca89d80b5e08c586?pretty=fuller

I looked through them, but I'm not particularly convinced that this regression range is correct. Mind taking a look to confirm?

The outcome is just slightly over our threshold for something that constitutes flaky.
Blockedon: 808684
I filed an issue around this. We use an upper bound of a 98% pass rate to decide whether something is flaky. I lowered it for the rerun I did, to avoid this sort of thing:

https://findit-for-me.appspot.com/waterfall/flake?key=ag9zfmZpbmRpdC1mb3ItbWVyjwELEhdNYXN0ZXJGbGFrZUFuYWx5c2lzUm9vdCJZY2hyb21pdW0ud2luL1dpbjcgVGVzdHMgKGRiZykoMSkvNjIxNTkvYnJvd3Nlcl90ZXN0cy9Rbkp2ZDNObGNsUmxjM1F1VjJsdVpHOTNUM0JsYmtOc2IzTmwMCxITTWFzdGVyRmxha2VBbmFseXNpcxgDDA

I'll follow up again once this analysis completes to see if my change has the desired effect.
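To make the threshold discussion concrete, here is a minimal sketch of classifying a data point by pass rate. Only the 98% upper bound comes from this thread; the function name, the lower bound, and the exact boundary handling (`>=` versus `>`) are assumptions for illustration, not Findit's actual code.

```python
# Hypothetical sketch; only the 98% upper bound is taken from this thread.
UPPER_FLAKE_THRESHOLD = 0.98  # pass rate at or above this counts as stable
LOWER_FLAKE_THRESHOLD = 0.02  # assumed lower bound, not stated in the thread


def classify(pass_count, total_runs,
             upper=UPPER_FLAKE_THRESHOLD, lower=LOWER_FLAKE_THRESHOLD):
    """Classify one data point as 'stable', 'flaky', or 'failing'."""
    if total_runs <= 0:
        raise ValueError("total_runs must be positive")
    pass_rate = float(pass_count) / total_runs
    if pass_rate >= upper:
        return "stable"
    if pass_rate <= lower:
        return "failing"
    return "flaky"


# A 97% pass rate is flaky with the 98% bound...
print(classify(97, 100))
# ...but counts as stable if the upper bound is lowered to 95%.
print(classify(97, 100, upper=0.95))
```

With these made-up numbers, lowering the upper bound moves a borderline point from "flaky" to "stable", which is the effect the rerun relies on.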
Thx for rerunning the analysis. I agree that the regression range looks unlikely.

It seems that the graph stops going into the past as soon as it finds one build that is above the threshold? Is it possible to go further to verify that the test is indeed stable before?
No problem, sorry Findit couldn't get answers for you. The commit log coming out of the rerun looks more likely (if for no other reason than that it's larger):

https://chromium.googlesource.com/chromium/src/+log/22acad3fa9d770e7a5f841fb25bb3a7649e554db..3b38e2d000fc51de3f573172c4912437acc628d4?pretty=fuller

It's not clear to me what exactly is flaking your test, since it times out on failure. See this test log for details on the test runs:

https://chromium-swarm.appspot.com/task?id=3b72187b49288510&refresh=10&show_raw=1

| It seems that the graph stops going into the past as soon as it finds one build that is above the threshold?  | Is it possible to go further to verify that the test is indeed stable before?

Findit assumes that a single point we know is stable is equivalent to multiple adjacent stable points. This regression range looks pretty convincing to me. The list is pretty long, but if try-jobs were run against this range, we might find the culprit. The problem is that we currently don't have a way to force try-jobs to run.

This is actionable from my end, I'll file a bug for it.
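As an illustration of the assumption described above, here is a minimal sketch of a backward walk that stops at the first stable build. The function name, the data layout, and the sample build numbers and pass rates are all made up; this is not Findit's implementation.

```python
def find_regression_range(points, upper=0.98):
    """Return (last_stable_build, first_flaky_build) or None.

    points: iterable of (build_number, pass_rate) tuples in any order.
    Walks from the newest build to the oldest and stops at the first
    build whose pass rate is at or above `upper` (treated as stable).
    """
    newer_flaky = None
    for build, pass_rate in sorted(points, reverse=True):  # newest first
        if pass_rate >= upper:
            # One stable point ends the search; older builds are not checked.
            return (build, newer_flaky) if newer_flaky is not None else None
        newer_flaky = build
    return None  # no stable build found in the data


# Made-up data: build 101 looks stable, but the older build 100 does not.
sample = [(103, 0.90), (102, 0.95), (101, 1.00), (100, 0.92)]
print(find_regression_range(sample))
```

In this made-up sample the walk returns builds 101 and 102 and never looks at build 100, which is the concern raised in the quoted question: a single stable point is trusted without checking further into the past.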
Blockedon: 809158
Filed a bug to allow admins to force try-jobs to run. In this case, where your flakiness problem might be upstream, it would be helpful to force try-jobs to run to confirm what's wrong.
We've moved forward with a change to run try-jobs for these cases regardless of confidence. That'll help determine the true culprit of this. Hopefully this will be deployed tomorrow!
I can repro the failures locally, and it seems that the timeouts are just because the test sometimes takes a long time, i.e., increasing the timeout will make it pass reliably.
We missed this week's deployment; we're thinking next week. I posted a regression range that might be useful in the meantime.

https://chromium.googlesource.com/chromium/src/+log/22acad3fa9d770e7a5f841fb25bb3a7649e554db..3b38e2d000fc51de3f573172c4912437acc628d4?pretty=fuller
I looked at BrowserTest.WindowOpenClose3 (https://findit-for-me.appspot.com/waterfall/flake?key=ag9zfmZpbmRpdC1mb3ItbWVykwELEhdNYXN0ZXJGbGFrZUFuYWx5c2lzUm9vdCJdY2hyb21pdW0ud2luL1dpbjcgVGVzdHMgKGRiZykoMSkvNjU5MzIvYnJvd3Nlcl90ZXN0cy9Rbkp2ZDNObGNsUmxjM1F1VjJsdVpHOTNUM0JsYmtOc2IzTmxNdz09DAsSE01hc3RlckZsYWtlQW5hbHlzaXMYAQw) however, the regression range there is incorrect as well :/

I've split up WindowOpenClose into WindowOpenClose{1,2,3}, and 3 is the one that's flaky on Windows. The correct build to blame would therefore be the one where I split up the test, but the analysis points to a random other build.

Comment 21 by wylieb@google.com, Mar 21 2018

Thanks for pointing that out!

The analysis you linked to has only 40% confidence, so its findings shouldn't be considered correct. Having said that, it looks like the test was fully passing on build 65922 and slightly flaky on the next build, 65923, so according to Findit this is a good reason to consider these two builds as the regression range. From what I can tell, it might have been the chrome-release-bot change that introduced the flakiness (at least from Findit's perspective).

https://chromium.googlesource.com/chromium/src/+/9b1c4662c8d16ef7a84ce3ec47ff160a5293bb7b

When did you split the test up? This analysis is roughly a month old.


Blockedon: 824448
Owner: ----
Status: Available (was: Assigned)
