Mac ASan errors on chromium.memory for unit_tests don't turn the bot red
Issue description

Chrome Version: N/A
OS: Mac

What steps will reproduce the problem?
(1) Go to this build result page and see that all steps are green and passing: https://luci-milo.appspot.com/buildbot/chromium.memory/Mac%20ASan%2064%20Tests%20%281%29/31460
(2) Open the test log for unit_tests: https://luci-logdog.appspot.com/v/?s=chromium%2Fbb%2Fchromium.memory%2FMac_ASan_64_Tests__1_%2F31460%2F%2B%2Frecipes%2Fsteps%2Funit_tests%2F0%2Fstdout
(3) Search for "heap-use-after-free" and find results!!

What is the expected result?
"heap-use-after-free" should fail the unit_tests step and turn the bot red.

What happens instead?
The unit_tests step is passing.
Jun 20 2017
Your diagnosis is correct: the test is crashing during the initial run and passing on a retry, so it is being treated as a flake. It's not clear to me that we should have a different policy for sanitizer failures than for any other failure; obviously, such a crash is bad, but so is any crash ...
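The retry behavior described here can be sketched roughly as follows. This is a minimal illustration, not the actual test launcher or recipe code; the names `Outcome`, `classify`, and `step_is_green` are hypothetical and exist only for this example.

```python
from enum import Enum


class Outcome(Enum):
    PASS = "pass"
    FLAKY = "flaky"
    FAIL = "fail"


def classify(initial_passed: bool, retry_passed: bool) -> Outcome:
    """Classify one test from its initial run and its retry (hypothetical helper)."""
    if initial_passed:
        return Outcome.PASS
    # A crash or failure on the first run that passes on retry counts as a flake.
    return Outcome.FLAKY if retry_passed else Outcome.FAIL


def step_is_green(outcomes) -> bool:
    """The step stays green unless some test failed both the run and the retry."""
    return all(outcome is not Outcome.FAIL for outcome in outcomes)
```

Under this policy, a test that crashes with an ASan report on the first run and passes when retried is classified as `FLAKY`, so the step still reports green.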
Jun 20 2017
[mac triage] erikchen@ for memory
Jun 20 2017
The problem is that these memory errors occur across several runs, and the retry-in-isolation is papering over that fact. It even looks like some of these errors are totally reliable when not run in isolation: https://test-results.appspot.com/dashboards/flakiness_dashboard.html#testType=unit_tests&builder=chromium.memory%3AMac%20ASan%2064%20Tests%20(1). So these are real errors, and they're either problems with the tests themselves (which could then cause corruption in other tests) or, even worse, memory corruption in the non-test code/product. I think part of the problem with retry-in-isolation is that some memory errors require a degree of heap activity in order to trigger. I only noticed this because someone reported an ASan failure as bug 734019. And that test does have a clear memory error in the code, so I don't think that treating this as a flake is correct.
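One way to express the policy being argued for here is to scan the step's log for sanitizer reports and fail the step regardless of what the retry did. The sketch below is only an illustration under that assumption: the helper names `find_asan_errors` and `step_should_fail` are hypothetical, not existing infrastructure functions. It relies on the standard ASan report header, which has the form `==<pid>==ERROR: AddressSanitizer: <error-kind> ...`.

```python
import re

# Matches the header line of an AddressSanitizer report and captures the
# error kind, e.g. "heap-use-after-free".
ASAN_ERROR = re.compile(r"==\d+==ERROR: AddressSanitizer: (\S+)")


def find_asan_errors(log_text: str) -> list:
    """Return the error kind of every ASan report found in the log (hypothetical helper)."""
    return [match.group(1) for match in ASAN_ERROR.finditer(log_text)]


def step_should_fail(log_text: str, retry_passed: bool) -> bool:
    """Fail on any ASan report, even when the retry-in-isolation passed."""
    return bool(find_asan_errors(log_text)) or not retry_passed
```

With this rule, a log containing a `heap-use-after-free` report would turn the step red even though the retried test passed. Whether sanitizer failures should get a different policy from other failures is exactly what the thread is debating.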
Jun 20 2017
That same argument can be made about any other kind of failure. I'm not disagreeing with you; it's simply true (a known tradeoff) that retrying failures will lead you to ignore classes of failures, which is why people need to be looking at the flakiness dashboard, which *doesn't* ignore these failures. Unfortunately, we don't currently have effective mechanisms for actively working on these sorts of failures. This is something the ops team is looking at. It will likely require changes in processes like sheriffing as well as just tooling changes, and I am totally fine w/ making such changes if need be.
Comment 1 by rsesek@chromium.org, Jun 19 2017