Flakes are found in None in telemetry_gpu_integration_test.
Issue description
None* in telemetry_gpu_integration_test is flaky.

Findit has detected 4 flake occurrences of tests below within the past 24 hours:

gpu_tests.webgl_conformance_integration_test.WebGLConformanceIntegrationTest.WebglConformance_conformance_textures_svg_image_tex_2d_rgba_rgba_unsigned_byte
gpu_tests.webgl_conformance_integration_test.WebGLConformanceIntegrationTest.WebglConformance_conformance_textures_svg_image_tex_2d_rgb_rgb_unsigned_byte
gpu_tests.webgl_conformance_integration_test.WebGLConformanceIntegrationTest.WebglConformance_conformance_textures_svg_image_tex_2d_rgba_rgba_unsigned_short_5_5_5_1
gpu_tests.webgl_conformance_integration_test.WebGLConformanceIntegrationTest.WebglConformance_conformance_textures_svg_image_tex_2d_luminance_alpha_luminance_alpha_unsigned_byte

Please try to find and revert the culprit if the culprit is obvious. Otherwise please find an appropriate owner.
Jan 15
Thanks kmarshall@ for reporting.

chanli@, can you help me understand why this report is poor quality? (No links to the failures in the FindIt tool, and "None" being reported as the test name.)

Here are links to the four flakes:

gpu_tests.webgl_conformance_integration_test.WebGLConformanceIntegrationTest.WebglConformance_conformance_textures_svg_image_tex_2d_rgba_rgba_unsigned_byte
https://findit-for-me.appspot.com/flake/occurrences?key=ag9zfmZpbmRpdC1mb3ItbWVyzwELEgVGbGFrZSLDAWNocm9taXVtQHRlbGVtZXRyeV9ncHVfaW50ZWdyYXRpb25fdGVzdEBncHVfdGVzdHMud2ViZ2xfY29uZm9ybWFuY2VfaW50ZWdyYXRpb25fdGVzdC5XZWJHTENvbmZvcm1hbmNlSW50ZWdyYXRpb25UZXN0LldlYmdsQ29uZm9ybWFuY2VfY29uZm9ybWFuY2VfdGV4dHVyZXNfc3ZnX2ltYWdlX3RleF8yZF9yZ2JhX3JnYmFfdW5zaWduZWRfYnl0ZQw

gpu_tests.webgl_conformance_integration_test.WebGLConformanceIntegrationTest.WebglConformance_conformance_textures_svg_image_tex_2d_rgb_rgb_unsigned_byte
https://findit-for-me.appspot.com/flake/occurrences?key=ag9zfmZpbmRpdC1mb3ItbWVyzQELEgVGbGFrZSLBAWNocm9taXVtQHRlbGVtZXRyeV9ncHVfaW50ZWdyYXRpb25fdGVzdEBncHVfdGVzdHMud2ViZ2xfY29uZm9ybWFuY2VfaW50ZWdyYXRpb25fdGVzdC5XZWJHTENvbmZvcm1hbmNlSW50ZWdyYXRpb25UZXN0LldlYmdsQ29uZm9ybWFuY2VfY29uZm9ybWFuY2VfdGV4dHVyZXNfc3ZnX2ltYWdlX3RleF8yZF9yZ2JfcmdiX3Vuc2lnbmVkX2J5dGUM

gpu_tests.webgl_conformance_integration_test.WebGLConformanceIntegrationTest.WebglConformance_conformance_textures_svg_image_tex_2d_rgba_rgba_unsigned_short_5_5_5_1
https://findit-for-me.appspot.com/flake/occurrences?key=ag9zfmZpbmRpdC1mb3ItbWVy2AELEgVGbGFrZSLMAWNocm9taXVtQHRlbGVtZXRyeV9ncHVfaW50ZWdyYXRpb25fdGVzdEBncHVfdGVzdHMud2ViZ2xfY29uZm9ybWFuY2VfaW50ZWdyYXRpb25fdGVzdC5XZWJHTENvbmZvcm1hbmNlSW50ZWdyYXRpb25UZXN0LldlYmdsQ29uZm9ybWFuY2VfY29uZm9ybWFuY2VfdGV4dHVyZXNfc3ZnX2ltYWdlX3RleF8yZF9yZ2JhX3JnYmFfdW5zaWduZWRfc2hvcnRfNV81XzVfMQw

gpu_tests.webgl_conformance_integration_test.WebGLConformanceIntegrationTest.WebglConformance_conformance_textures_svg_image_tex_2d_luminance_alpha_luminance_alpha_unsigned_byte
https://findit-for-me.appspot.com/flake/occurrences?key=ag9zfmZpbmRpdC1mb3ItbWVy5QELEgVGbGFrZSLZAWNocm9taXVtQHRlbGVtZXRyeV9ncHVfaW50ZWdyYXRpb25fdGVzdEBncHVfdGVzdHMud2ViZ2xfY29uZm9ybWFuY2VfaW50ZWdyYXRpb25fdGVzdC5XZWJHTENvbmZvcm1hbmNlSW50ZWdyYXRpb25UZXN0LldlYmdsQ29uZm9ybWFuY2VfY29uZm9ybWFuY2VfdGV4dHVyZXNfc3ZnX2ltYWdlX3RleF8yZF9sdW1pbmFuY2VfYWxwaGFfbHVtaW5hbmNlX2FscGhhX3Vuc2lnbmVkX2J5dGUM
Jan 15
Looking at one of the failures:

https://ci.chromium.org/p/chromium/builders/luci.chromium.try/mac_chromium_rel_ng/223040
https://chromium-swarm.appspot.com/task?id=4254c0634ce39d10&refresh=10&show_raw=1

this looks like a bug in the DOM. A quick search through the bug database turned up Issue 914739 as a likely cause, though things seem to have stabilized.

yhirano@ could you please investigate and confirm this? Thanks.

[4662:775:0111/000801.196297:FATAL:frame_or_imported_document.cc(62)] Check failed: !document_.
0 Chromium Framework 0x0000000118c770cf base::debug::StackTrace::StackTrace(unsigned long) + 31
1 Chromium Framework 0x0000000118b71d8f logging::LogMessage::~LogMessage() + 223
2 Chromium Framework 0x000000011df8c257 blink::FrameOrImportedDocument::UpdateDocument(blink::Document&) + 103
3 Chromium Framework 0x000000011df7bc77 blink::FrameFetchContext::ProvideDocumentToContext(blink::Document*) + 103
4 Chromium Framework 0x000000011d45008f blink::Document::Document(blink::DocumentInit const&, unsigned char) + 5999
5 Chromium Framework 0x000000011d5a183e blink::XMLDocument::XMLDocument(blink::DocumentInit const&, unsigned char) + 14
6 Chromium Framework 0x000000011d479a3f blink::XMLDocument* blink::ConstructTrait<blink::XMLDocument, true>::Construct<blink::DocumentInit const&, int>(blink::DocumentInit const&&&, int&&) + 399
7 Chromium Framework 0x000000011d4a5c6f blink::DOMImplementation::createDocument(WTF::String const&, blink::DocumentInit const&, bool) + 111
8 Chromium Framework 0x000000011d808d8d blink::LocalDOMWindow::InstallNewDocument(WTF::String const&, blink::DocumentInit const&, bool) + 333
9 Chromium Framework 0x000000011d820549 blink::LocalFrame::ForceSynchronousDocumentInstall(WTF::AtomicString const&, scoped_refptr<blink::SharedBuffer>) + 345
10 Chromium Framework 0x000000011e230920 blink::SVGImage::DataChanged(bool) + 1520
11 Chromium Framework 0x000000011dfc5140 blink::ImageResourceContent::UpdateImage(scoped_refptr<blink::SharedBuffer>, blink::ResourceStatus, blink::ImageResourceContent::UpdateImageOption, bool, bool) + 560
12 Chromium Framework 0x000000011dfc0383 blink::ImageResource::Finish(base::TimeTicks, base::SingleThreadTaskRunner*) + 147
13 Chromium Framework 0x0000000117f7759e blink::ResourceFetcher::HandleLoaderFinish(blink::Resource*, base::TimeTicks, blink::ResourceFetcher::LoaderFinishType, unsigned int, bool, std::__1::vector<network::cors::PreflightTimingInfo, std::__1::allocator<network::cors::PreflightTimingInfo> > const&) + 1886
14 Chromium Framework 0x0000000117f96798 blink::ResourceLoader::DidFinishLoading(base::TimeTicks, long long, long long, long long, bool, std::__1::vector<network::cors::PreflightTimingInfo, std::__1::allocator<network::cors::PreflightTimingInfo> > const&) + 312
15 Chromium Framework 0x000000011ef02e36 content::WebURLLoaderImpl::Context::OnCompletedRequest(network::URLLoaderCompletionStatus const&) + 582
16 Chromium Framework 0x000000011ef036eb content::WebURLLoaderImpl::RequestPeerImpl::OnCompletedRequest(network::URLLoaderCompletionStatus const&) + 107
17 Chromium Framework 0x000000011eeec210 content::ResourceDispatcher::OnRequestComplete(int, network::URLLoaderCompletionStatus const&) + 1408
18 Chromium Framework 0x000000011eefbf99 content::URLResponseBodyConsumer::OnReadable(unsigned int) + 1017
19 Chromium Framework 0x000000011539eb17 mojo::SimpleWatcher::DiscardReadyState(base::RepeatingCallback<void (unsigned int)> const&, unsigned int, mojo::HandleSignalsState const&) + 103
20 Chromium Framework 0x0000000118d0c8ab mojo::SimpleWatcher::OnHandleReady(int, unsigned int, mojo::HandleSignalsState const&) + 379
21 Chromium Framework 0x0000000118d0cdd1 void base::internal::Invoker<base::internal::BindState<void (mojo::SimpleWatcher::*)(int, unsigned int, mojo::HandleSignalsState const&), base::WeakPtr<mojo::SimpleWatcher>, int, unsigned int, mojo::HandleSignalsState>, void ()>::RunImpl<void (mojo::SimpleWatcher::* const&)(int, unsigned int, mojo::HandleSignalsState const&), std::__1::tuple<base::WeakPtr<mojo::SimpleWatcher>, int, unsigned int, mojo::HandleSignalsState> const&, 0ul, 1ul, 2ul, 3ul>(void (mojo::SimpleWatcher::* const&&&)(int, unsigned int, mojo::HandleSignalsState const&), std::__1::tuple<base::WeakPtr<mojo::SimpleWatcher>, int, unsigned int, mojo::HandleSignalsState> const&&&, std::__1::integer_sequence<unsigned long, 0ul, 1ul, 2ul, 3ul>) + 193
22 Chromium Framework 0x0000000118b5b531 base::debug::TaskAnnotator::RunTask(char const*, base::PendingTask*) + 305
23 Chromium Framework 0x0000000118c0accd base::sequence_manager::internal::ThreadControllerWithMessagePumpImpl::DoWorkImpl(base::TimeTicks*) + 685
24 Chromium Framework 0x0000000118b8f073 base::MessagePumpCFRunLoopBase::RunWork() + 51
25 Chromium Framework 0x0000000118b74d1a base::mac::CallWithEHFrame(void () block_pointer) + 10
26 Chromium Framework 0x0000000118b8e84f base::MessagePumpCFRunLoopBase::RunWorkSource(void*) + 63
27 CoreFoundation 0x00007fff423d5a11 __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 17
28 CoreFoundation 0x00007fff4248f42c __CFRunLoopDoSource0 + 108
29 CoreFoundation 0x00007fff423b8470 __CFRunLoopDoSources0 + 208
30 CoreFoundation 0x00007fff423b78ed __CFRunLoopRun + 1293
31 CoreFoundation 0x00007fff423b7153 CFRunLoopRunSpecific + 483
32 Foundation 0x00007fff444b3f26 -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 277
33 Chromium Framework 0x0000000118b8f8e1 base::MessagePumpNSRunLoop::DoRun(base::MessagePump::Delegate*) + 113
34 Chromium Framework 0x0000000118b8e1bf base::MessagePumpCFRunLoopBase::Run(base::MessagePump::Delegate*) + 127
35 Chromium Framework 0x0000000118c0b51b base::sequence_manager::internal::ThreadControllerWithMessagePumpImpl::Run(bool) + 155
36 Chromium Framework 0x0000000118bc7bb5 base::RunLoop::Run() + 789
37 Chromium Framework 0x000000011f0be9b7 content::RendererMain(content::MainFunctionParams const&) + 1399
38 Chromium Framework 0x000000011853fdc3 content::ContentMainRunnerImpl::Run(bool) + 435
39 Chromium Framework 0x000000011bb3de1b service_manager::Main(service_manager::MainParams const&) + 3035
40 Chromium Framework 0x000000011853eee4 content::ContentMain(content::ContentMainParams const&) + 68
41 Chromium Framework 0x000000011451400f ChromeMain + 175
42 Chromium Helper 0x000000010f4d89c0 main + 480
43 libdyld.dylib 0x00007fff6a1f4015 start + 1
Jan 15
Android failures are the same issue:

https://ci.chromium.org/p/chromium/builders/luci.chromium.try/android-marshmallow-arm64-rel/165896
https://chromium-swarm.appspot.com/task?id=4254cadf2d353b10&refresh=10&show_raw=1

[FATAL:frame_or_imported_document.cc(62)] Check failed: !document_. '
Stack Trace:
RELADDR FUNCTION FILE:LINE
000000000006a144 <UNKNOWN> /system/lib64/libc.so
00000000000678d4 <UNKNOWN> /system/lib64/libc.so
0000000000023838 <UNKNOWN> /system/lib64/libc.so
000000000001dfd8 <UNKNOWN> /system/lib64/libc.so
000000000439116c base::debug::BreakDebugger() ??:0:0
00000000042e0f88 logging::LogMessage::~LogMessage() ??:0:0
000000000689f5f8 blink::FrameOrImportedDocument::UpdateDocument(blink::Document&) ??:0:0
00000000068972ac blink::FrameFetchContext::ProvideDocumentToContext(blink::Document*) ??:0:0
00000000062fbcf0 blink::Document::Document(blink::DocumentInit const&, unsigned char) ??:0:0
00000000063a2490 blink::XMLDocument::XMLDocument(blink::DocumentInit const&, unsigned char) ??:0:0
00000000063114e0 blink::XMLDocument* blink::ConstructTrait<blink::XMLDocument, true>::Construct<blink::DocumentInit const&, blink::DocumentClass>(blink::DocumentInit const&, blink::DocumentClass&&) ??:0:0
0000000006326284 blink::XMLDocument::CreateSVG(blink::DocumentInit const&) ??:0:0
0000000006326b30 blink::DOMImplementation::createDocument(WTF::String const&, blink::DocumentInit const&, bool) ??:0:0
00000000064d14a4 blink::LocalDOMWindow::CreateDocument(WTF::String const&, blink::DocumentInit const&, bool) ??:0:0
00000000064d15ac blink::LocalDOMWindow::InstallNewDocument(WTF::String const&, blink::DocumentInit const&, bool) ??:0:0
00000000064dc320 blink::LocalFrame::ForceSynchronousDocumentInstall(WTF::AtomicString const&, scoped_refptr<blink::SharedBuffer>) ??:0:0
00000000069d6718 blink::SVGImage::DataChanged(bool) ??:0:0
000000000605cd44 blink::Image::SetData(scoped_refptr<blink::SharedBuffer>, bool) ??:0:0
00000000068bbf80 blink::ImageResourceContent::UpdateImage(scoped_refptr<blink::SharedBuffer>, blink::ResourceStatus, blink::ImageResourceContent::UpdateImageOption, bool, bool) ??:0:0
...
Jan 15
Thank you for reporting this to us.

Bug 921130 was created for a group of flakes, which should be in the same suite and always happen together. This is a new feature, so there are some bugs we need to fix:

1. There is supposed to be a follow-up comment with a link to a page listing all the mentioned failures. I need to check why that comment doesn't get posted.
2. When we don't get the suite info for flakes, we should not group them in the first place. BTW, what should we use as the suite_name for tests in gpu_tests?

I'll work on the fixes (should be quick).
Jan 15
Here is a link to the flakes in this bug: https://findit-for-me.appspot.com/ranked-flakes?bug_id=921130
Jan 16
I'm confused, because I removed the DCHECK you are referring to BEFORE landing.
Jan 16
(6 days ago)
yhirano@: Hmm. I randomly chose https://ci.chromium.org/p/chromium/builders/luci.chromium.try/android-marshmallow-arm64-rel/165896 to look at, and it turned out that that was a tryjob of one of your earlier patch sets which must have still had the DCHECK in it.

I didn't realize that the only other recent flake on Android of the first one:

https://findit-for-me.appspot.com/flake/occurrences?key=ag9zfmZpbmRpdC1mb3ItbWVy5QELEgVGbGFrZSLZAWNocm9taXVtQHRlbGVtZXRyeV9ncHVfaW50ZWdyYXRpb25fdGVzdEBncHVfdGVzdHMud2ViZ2xfY29uZm9ybWFuY2VfaW50ZWdyYXRpb25fdGVzdC5XZWJHTENvbmZvcm1hbmNlSW50ZWdyYXRpb25UZXN0LldlYmdsQ29uZm9ybWFuY2VfY29uZm9ybWFuY2VfdGV4dHVyZXNfc3ZnX2ltYWdlX3RleF8yZF9sdW1pbmFuY2VfYWxwaGFfbHVtaW5hbmNlX2FscGhhX3Vuc2lnbmVkX2J5dGUM
https://ci.chromium.org/p/chromium/builders/luci.chromium.try/android-marshmallow-arm64-rel/166193

looked like a complete device failure.

chanli@, I think that FindIt's analysis might have failed here. The tryjobs that showed the flakes were dry runs, not actual CQ attempts, so by a strict definition, they weren't a CQ false rejection. The grouping of this failure into the other ones which were essentially device failure was also not ideal, though I understand it might have been difficult to figure out that the Android device failed just from the failed tests' JSON output, since it looked like basically all the tests failed.

For the Telemetry-based GPU tests, can the suite name be taken from the step name? In this case, it would be "webgl_conformance_tests". If that's not possible, then can it be the second part of the full test name (in this case, "webgl_conformance_integration_test")?

chanli@, I'm assigning this to you because there was no actual flake here. The one flake that the system detected in yhirano@'s dry runs seemed to cause incorrect grouping with other device failures.
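The two options proposed above could be sketched roughly as follows. This is only an illustration of the suggestion, not Findit's code: the function name gpu_suite_name and its parameters are hypothetical, and only the example names come from this thread.

```python
from typing import Optional


def gpu_suite_name(test_name: str, step_name: Optional[str] = None) -> str:
    """Pick a suite name for a Telemetry-based GPU test (hypothetical helper).

    Prefer the step name when it is available; otherwise fall back to the
    second dot-separated component of the full test name.
    """
    if step_name:
        return step_name
    parts = test_name.split(".")
    if len(parts) >= 2:
        return parts[1]
    # The name has no second component, so return it unchanged.
    return test_name
```

With the names from this bug, passing the step name gives "webgl_conformance_tests", and omitting it gives "webgl_conformance_integration_test".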
Jan 16
(6 days ago)
kbr@, I'll make a quick change to use the canonical step name as the suite name for Telemetry-based GPU tests.

About identifying Telemetry-based GPU tests: I assume all of their names start with gpu_tests? Or is there a better way to identify them?

As for filtering out failures caused by device failure rather than by the tests themselves, I think this is a good example of something that should be identified as a 'flaky step' as a whole rather than as a lot of 'flaky tests'. stgao@ has some plans to add that level of support, though we don't have a hard timeline yet.
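One way the 'flaky step' versus 'flaky tests' distinction might be drawn is by the share of tests that failed within a step. The sketch below is purely an assumption for illustration: the function name, the labels, and the 0.9 threshold are all hypothetical and are not taken from Findit or from any planned design.

```python
def classify_step_failure(total_tests: int, failed_tests: int,
                          step_failure_ratio: float = 0.9) -> str:
    """Label a step's failures (hypothetical heuristic).

    step_failure_ratio is an assumed cut-off: if at least that share of
    the tests in the step failed, treat the step as a whole as flaky
    (for example a device failure) instead of reporting each test.
    """
    if total_tests <= 0:
        raise ValueError("total_tests must be positive")
    if not 0 <= failed_tests <= total_tests:
        raise ValueError("failed_tests must be between 0 and total_tests")
    if failed_tests == 0:
        return "no_failure"
    if failed_tests / total_tests >= step_failure_ratio:
        return "flaky_step"
    return "flaky_tests"
```

A run where basically all the tests failed would then be reported once as a step-level flake rather than grouped with unrelated test-level flakes.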
Jan 17
(6 days ago)
Is it possible to look at the name of the isolate? If so, all of the Telemetry-based GPU tests use the telemetry_gpu_integration_test isolate. Otherwise, yes, for now if the test name starts with "gpu_tests." that's a good indicator. (rmhasan@ may be changing the naming convention of the tests soon, however, so it would be best to not rely on that.)
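A minimal sketch of that check, assuming the isolate name is available as a plain string. The function name and parameters are hypothetical; only the isolate name telemetry_gpu_integration_test and the "gpu_tests." prefix come from this thread.

```python
from typing import Optional

# Both values are taken from this thread; everything else is hypothetical.
GPU_ISOLATE_NAME = "telemetry_gpu_integration_test"
GPU_TEST_NAME_PREFIX = "gpu_tests."


def is_telemetry_gpu_test(test_name: str,
                          isolate_name: Optional[str] = None) -> bool:
    """Return True if the test looks like a Telemetry-based GPU test."""
    if isolate_name is not None:
        # Preferred signal: the isolate that ran the test.
        return isolate_name == GPU_ISOLATE_NAME
    # Fallback only: the naming convention may change.
    return test_name.startswith(GPU_TEST_NAME_PREFIX)
```

Checking the isolate first keeps the logic working even if the test naming convention changes later.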
Jan 17
(6 days ago)
Yes, we can get the isolate name. Thanks for the information.
Jan 18
(4 days ago)
The following revision refers to this bug:
https://chromium.googlesource.com/infra/infra/+/271fede7a484a0a7c12bfc4e608acef5ebbd8963

commit 271fede7a484a0a7c12bfc4e608acef5ebbd8963
Author: Chan <chanli@chromium.org>
Date: Fri Jan 18 17:46:04 2019

[Findit] For Telemetry-based GPU tests, use canonical step name as suite name.

Bug: 921130
Change-Id: I663d708c429b42ea1a67c533368f948f7144d982
Reviewed-on: https://chromium-review.googlesource.com/c/1416252
Reviewed-by: Kenneth Russell <kbr@chromium.org>
Reviewed-by: Shuotao Gao <stgao@chromium.org>
Commit-Queue: Chan Li <chanli@chromium.org>
Cr-Commit-Position: refs/heads/master@{#20075}

[modify] https://crrev.com/271fede7a484a0a7c12bfc4e608acef5ebbd8963/appengine/findit/services/flake_detection/test/detect_flake_occurrences_test.py
[modify] https://crrev.com/271fede7a484a0a7c12bfc4e608acef5ebbd8963/appengine/findit/services/flake_detection/detect_flake_occurrences.py
Jan 18
(4 days ago)
Just wanted to double-check the concept of suite name.

On the Findit side, the suite name is defined as the smallest group of tests that are defined in the same file or directory. For example:

* In the gtest TabRestoreTest.RestoreTabWithSpecialURL of network_service_browser_tests, the suite name is TabRestoreTest
* In the layout test third_party/WebKit/LayoutTests/http/tests/devtools/network/network-blocked-reason.js, the suite name is the directory third_party/WebKit/LayoutTests/http/tests/devtools/network/
* In the Java JUnit/Instrumentation test org.chromium.chrome.browser.ExampleUiCaptureTest#testCaptureTabSwitcher, the suite name is the Java class name ExampleUiCaptureTest

For a GPU test such as gpu_tests.webgl_conformance_integration_test.WebGLConformanceIntegrationTest.WebglConformance_conformance_textures_svg_image_tex_2d_rgba_rgba_unsigned_byte, we are using its test/isolate target name "webgl_conformance_tests" instead of the smaller group name "gpu_tests.webgl_conformance_integration_test.WebGLConformanceIntegrationTest". This is inconsistent with the other test types. Is this really what we want?
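The examples above can be sketched as a single lookup. This is an illustration of the rules as described in this comment, not the actual Findit implementation: the function name and the test_type labels ("gtest", "layout", "java", "gpu") are hypothetical.

```python
import posixpath


def suite_name(test_type: str, test_name: str, step_name: str = "") -> str:
    """Derive a suite name per test type (hypothetical sketch)."""
    if test_type == "gtest":
        # Suite.Case -> Suite
        return test_name.split(".", 1)[0]
    if test_type == "layout":
        # path/to/dir/test.js -> path/to/dir/
        return posixpath.dirname(test_name) + "/"
    if test_type == "java":
        # org.pkg.ClassName#method -> ClassName
        class_path = test_name.split("#", 1)[0]
        return class_path.rsplit(".", 1)[-1]
    if test_type == "gpu":
        # Special case discussed in this bug: use the step name.
        return step_name
    raise ValueError("unknown test type: %s" % test_type)
```

The GPU branch is the odd one out: it ignores the test name entirely, which is the inconsistency raised in this comment.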
Jan 18
(4 days ago)
I see your point. The autogenerated names from Telemetry / typ like "gpu_tests.webgl_conformance_integration_test.WebGLConformanceIntegrationTest" are unwieldy and redundant, and I definitely do not want to use them for this purpose. Other suggestions welcome, but the step name seems OK to me.
Jan 18
(4 days ago)
Thanks for the clarification. I'm OK with this special case, but just wanted to double check whether more fine-grained grouping is expected. This will affect the searching on go/ranked-flakes and clustering flakes to file a single bug like this one.
Comment 1 by kmarshall@chromium.org, Jan 14
Owner: kbr@chromium.org
Status: Assigned (was: Untriaged)