MemoryTracingTest.BrowserInitiatedDump is flaky on Linux |
|||
Issue description
This test periodically fails like so:
[ RUN ] MemoryTracingTest.BrowserInitiatedDump
[18298:18298:0407/094525.486910:3054404219:WARNING:audio_manager.cc(295)] Multiple instances of AudioManager detected
[18298:18298:0407/094525.486971:3054404262:WARNING:audio_manager.cc(254)] Multiple instances of AudioManager detected
Xlib: extension "RANDR" missing on display ":99".
unknown file: Failure
Unexpected mock function call - returning directly.
Function call: OnMemoryDumpDone(0, false)
Google Mock tried the following 1 expectation, but it didn't match:
../../content/browser/tracing/memory_tracing_browsertest.cc:316: EXPECT_CALL(*this, OnMemoryDumpDone(_, true ))...
Expected arg #1: is equal to true
Actual: false
Expected: to be called once
Actual: never called - unsatisfied and active
../../content/browser/tracing/memory_tracing_browsertest.cc:316: Failure
Actual function call count doesn't match EXPECT_CALL(*this, OnMemoryDumpDone(_, true ))...
Expected: to be called once
Actual: never called - unsatisfied and active
[ FAILED ] MemoryTracingTest.BrowserInitiatedDump, where TypeParam = and GetParam() = (193 ms)
For example: https://uberchromegw.corp.google.com/i/chromium.linux/builders/Linux%20Tests/builds/54499
Marking for Speed>Tracing triage.
,
Apr 7 2017
Let's disabled the test for the moment. I start wondering whether this is something similar to Issue 656729 , where the trace config file is left around, and we start this test with a trace config, hence failing because tracing is spontaneously enabled. Can somebody give me a link to the flakiness dashboard that shows when this test started being flaky?
,
Apr 7 2017
I will write a cl today which adds a bunch of logs to all the spots that cause dump to fail. Let's see if we can catch these better in bots.
,
Apr 7 2017
Test disable is going in now: https://codereview.chromium.org/2804313002 Flakiness dash is here: https://test-results.appspot.com/dashboards/flakiness_dashboard.html#tests=MemoryTracingTest.BrowserInitiatedDump&testType=browser_side_navigation_content_browsertests
,
Apr 7 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/1d043d260a440f6ab4ed5f1e78641823e95961c3 commit 1d043d260a440f6ab4ed5f1e78641823e95961c3 Author: ellyjones <ellyjones@chromium.org> Date: Fri Apr 07 18:04:33 2017 tracing: disable MemoryTracingTest.BrowserInitiatedDump This test is flaky on Linux BUG=709524 TBR=dsinclair@chromium.org Review-Url: https://codereview.chromium.org/2804313002 Cr-Commit-Position: refs/heads/master@{#462920} [modify] https://crrev.com/1d043d260a440f6ab4ed5f1e78641823e95961c3/content/browser/tracing/memory_tracing_browsertest.cc
,
Apr 10 2017
This might be a dupe of Issue 529678
,
Apr 12 2017
I can reproduce this locally. 5 runs failed out of 200 runs on my machine. I will take a look.
,
Apr 12 2017
Weird... This comment implies that StartTracing waits for all processes to ack receiving the BeginTracing message: https://cs.chromium.org/chromium/src/content/browser/tracing/tracing_controller_impl.cc?l=249 But, there is no IPC message for acking BeginTracing and in fact start_tracing_done_callback is called as soon as BeginTracing is sent to all processes, without waiting for acks: https://cs.chromium.org/chromium/src/content/browser/tracing/tracing_controller_impl.cc?dr=CSs&l=258 I think this is the root cause of the flakiness. Sometimes, a child receives requests for dumps before it receives the BeginTracing message. My hypothesis is that, and I'm not sure at all, the flakiness was not happening before factoring out memory-infra from tracing controller because tracing and memory-infra messages were using the same message filter and the order of IPCs would be preserved, but now they are using different channels...maybe...?
,
Apr 13 2017
Since I'm already in the process of mojofying tracing, I think it makes sense to address this in the tracing service instead of adding a new IPC message to ack BeginTracing. I will block this on issue 640235 . primiano@, what do you think?
,
Apr 13 2017
> Weird... This comment implies that StartTracing waits for all processes to ack receiving the BeginTracing message: > https://cs.chromium.org/chromium/src/content/browser/tracing/tracing_controller_impl.cc?l=249 This is only about in-process acks by agents (AFAIK agents are all in the browser process, there is no mojo/IPC involved). That is about waiting on the OnStartAgentTracingAcked. > But, there is no IPC message for acking BeginTracing and in fact start_tracing_done_callback is called as soon as BeginTracing is sent to all processes, without waiting for acks: Right, I think BeginTracing has no acks and start_tracing_done_callback is all about "the browser process is ready with all its agents", not about ready state of other processes. > I think this is the root cause of the flakiness. Sometimes, a child receives requests for dumps before it receives the BeginTracing message. Uhm, I thought IPCs were linearized, so if browser sends (start, dump) the child process should receive them necessarily in order. > My hypothesis is that, and I'm not sure at all, the flakiness was not happening before factoring out memory-infra from tracing controller because tracing and memory-infra messages were using the same message filter and the order of IPCs would be preserved, but now they are using different channels...maybe...? Aaaa right, now the dump message is using a different mechanism (Mojo) which is, by design, not linearized. > Since I'm already in the process of mojofying tracing, I think it makes sense to address this in the tracing service instead of adding a new IPC message to ack BeginTracing. I will block this on issue 640235 . primiano@, what do you think? Yeah definitely agree. So I think the real problem is that the test is too aggressive in its expectations right now. The only real problem here could be telemetry/devtools: if we do Tracing.Start and Tracing.CreateMemoryDump (or whatever are the right name one would expect that the 2nd would work. Having said this, is not a case we care about today, so fine to wait for the mojoification, no rush. Out of cusiosity, how many tests depend on this behavior now? Is it just one (in which case disable it) or more?
,
Apr 13 2017
We also have slow reports which depend on start tracing done callback to invoke dump. The TestMemoryInfra browser test should also be flaky since the same code is used.
,
Aug 30 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/b3f624bd0e196abcec3e29c18c353b7c6724fc87 commit b3f624bd0e196abcec3e29c18c353b7c6724fc87 Author: Siddhartha <ssid@chromium.org> Date: Wed Aug 30 01:14:53 2017 Enable Memory-infra tracing browser tests The tests were disabled due to the dumps failing because of requesting dumps before tracing is enabled in renderers. The fix is blocked on tracing servicification work. Meanwhile enable the tests without check for dump success to ensure tracing does not crash. Also do not fail process dump if MDM is not initialized. Fix test memory dump configs to not include default categories. BUG=708487, 709524 TBR=primiano@chromium.org Change-Id: Ib5f4f69c76bdba889828ab8b2907fd03b770d7fe Reviewed-on: https://chromium-review.googlesource.com/634485 Commit-Queue: Siddhartha S <ssid@chromium.org> Reviewed-by: Ehsan Chiniforooshan <chiniforooshan@chromium.org> Cr-Commit-Position: refs/heads/master@{#498328} [modify] https://crrev.com/b3f624bd0e196abcec3e29c18c353b7c6724fc87/base/trace_event/memory_dump_manager.cc [modify] https://crrev.com/b3f624bd0e196abcec3e29c18c353b7c6724fc87/base/trace_event/memory_dump_manager.h [modify] https://crrev.com/b3f624bd0e196abcec3e29c18c353b7c6724fc87/base/trace_event/trace_config_memory_test_util.h [modify] https://crrev.com/b3f624bd0e196abcec3e29c18c353b7c6724fc87/chrome/test/BUILD.gn [rename] https://crrev.com/b3f624bd0e196abcec3e29c18c353b7c6724fc87/chrome/test/base/memory_tracing_browsertest.cc [modify] https://crrev.com/b3f624bd0e196abcec3e29c18c353b7c6724fc87/content/browser/tracing/memory_tracing_browsertest.cc |
|||
►
Sign in to add a comment |
|||
Comment 1 by hjd@chromium.org
, Apr 7 2017