Mojo/IPC messages sent immediately before a child process exits sometimes get "lost" |
|||||||
Issue descriptionUnder the Fuchsia FYI bots we see infrequent flakes of IPCChannel and Mojo tests, which fall into two observed behaviours: - MOJO_RESULT_FAILED_PRECONDITION result from Mojo WaitForSignals() on a connected channel (e.g issue 810448 ). - Test "hangs" due to IPC "quit" message never being received from the child process (e.g. issue 764015 ). All of these test cases involve tests which launch a child process that sends a message back to the main test process, before exiting. The failures are therefore consistent with Mojo messages occasionally being "dropped" if they are sent immediately before process exit (or possibly if they are sent immediately before Channel teardown?). Existing flake reports matching one of the two symptoms described above will be merged into this bug. New reports of flakes in these tests can be added as comments to this bug if they match, or if in doubt then please file a new bug for us to review and de-dupe if appropriate.
,
Feb 26 2018
Issue 810448 has been merged into this issue.
,
Feb 26 2018
Issue 755848 has been merged into this issue.
,
Feb 26 2018
Issue 816606 has been merged into this issue.
,
Feb 26 2018
Issue 764018 has been merged into this issue.
,
Feb 26 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/d95d4804ffb0644c2bdc32a1b890a9473b45be1e commit d95d4804ffb0644c2bdc32a1b890a9473b45be1e Author: Wez <wez@chromium.org> Date: Mon Feb 26 20:50:47 2018 Add OnChannelError() handler to QuitListener rather than hang on close. Without an OnChannelError() handler on the QuitListener, the test can hang if the IPC channel disconnects before the Quit message is received. Adding an explicit handler will make it easier to diagnose this failure mode. Also migrates DCHECK()s in some existing OnChannelError() handlers in tests to use CHECK(), to avoid those tests potentially hanging in Release builds. Bug: 816606 , 816620 , 764015 Change-Id: Ibe6ffc7ecd02f6029c74a7b3c1190030b656056f Reviewed-on: https://chromium-review.googlesource.com/936427 Reviewed-by: Ken Rockot <rockot@chromium.org> Commit-Queue: Wez <wez@chromium.org> Cr-Commit-Position: refs/heads/master@{#539262} [modify] https://crrev.com/d95d4804ffb0644c2bdc32a1b890a9473b45be1e/ipc/ipc_channel_mojo_unittest.cc [modify] https://crrev.com/d95d4804ffb0644c2bdc32a1b890a9473b45be1e/ipc/ipc_channel_proxy_unittest.cc
,
Feb 26 2018
Issue 815314 has been merged into this issue.
,
Feb 26 2018
,
Feb 26 2018
,
Mar 7 2018
IPCMojoBootstrapTest.ReceiveEmptyMessage appears to have hit this in https://build.chromium.org/p/chromium.fyi/builders/Fuchsia/builds/14550
,
Mar 7 2018
Hmm. We don't actually block shutdown on channels flushing their outgoing queue, but we probably should. It's so rare that a channel queues anything though, I'm surprised any tests would hit this. That is unless something about the Fuchsia implementation can cause writes to not complete immediately?
,
Mar 7 2018
The Channel implementation for Fuchsia actually doesn't implement queuing at all, since the zx_channel_write() API doesn't yet provide a flow-control signal. Potential sources of these message drops seem to be: 1. Incorrect handling of channel-readable vs channel-closed signals in the Channel impl for Fuchsia. 2. Lack of blocking to wait for IO-thread work to complete when exiting. 3. Raciness in QuitWhenIdle vs PostTask(WriteMessage) ordering during exit (i.e. this would most likely be highly specific to the tests, which do a lot of RunLoop+QuitWhenIdle stuff).
,
Mar 7 2018
Crash stack from failures in IPCChannelBadMessageTest look a little more helpful then IPCChannelProxy failures (which lose context due to use of PostTask): [00001.999] 02166.02205> [3:1682880091:0307/212758.430939:1999143:FATAL:ipc_channel_proxy_unittest.cc(75)] Check failed: quit_message_received_. #00: base::debug::StackTrace::StackTrace(unsigned long) at ??:? #01: base::debug::StackTrace::StackTrace() at ??:? #02: logging::LogMessage::~LogMessage() at ??:? #03: (anonymous namespace)::QuitListener::OnChannelError() at ??:? #04: IPC::ChannelMojo::OnPipeError() at ??:? #05: IPC::internal::MessagePipeReader::OnPipeError(unsigned int) at ??:? ... base::Callback glue ... #09: base::internal::Invoker<base::internal::BindState<void (IPC::internal::MessagePipeReader::*)(unsigned int), base::internal::UnretainedWrapper<IPC::internal::MessagePipeReader>, unsigned int>, void ()>::Run(base::internal::BindStateBase*) at ??:? #10: base::OnceCallback<void ()>::Run() && at ??:? #11: mojo::InterfaceEndpointClient::NotifyError(base::Optional<mojo::DisconnectReason> const&) at ??:? #12: IPC::(anonymous namespace)::ChannelAssociatedGroupController::NotifyEndpointOfError(IPC::(anonymous namespace)::ChannelAssociatedGroupController::Endpoint*, bool) at ??:? #13: IPC::(anonymous namespace)::ChannelAssociatedGroupController::OnPipeError() at ??:? ... base::Callback glue ... #17: base::internal::Invoker<base::internal::BindState<void (IPC::(anonymous namespace)::ChannelAssociatedGroupController::*)(), base::internal::UnretainedWrapper<IPC::(anonymous namespace)::ChannelAssociatedGroupController> >, void ()>::Run(base::internal::BindStateBase*) at ??:? #18: base::OnceCallback<void ()>::Run() && at ??:? #19: mojo::Connector::HandleError(bool, bool) at ??:? #20: mojo::Connector::OnHandleReadyInternal(unsigned int) at ??:? #21: mojo::Connector::OnWatcherHandleReady(unsigned int) at ??:? ... base::Callback glue ... #27: mojo::SimpleWatcher::DiscardReadyState(base::RepeatingCallback<void (unsigned int)> const&, unsigned int, mojo::HandleSignalsState const&) at ??:? ... base::Callback glue ... #33: mojo::SimpleWatcher::OnHandleReady(int, unsigned int, mojo::HandleSignalsState const&) at ??:? ... base::Callback glue ... #37: base::internal::Invoker<base::internal::BindState<void (mojo::SimpleWatcher::*)(int, unsigned int, mojo::HandleSignalsState const&), base::WeakPtr<mojo::SimpleWatcher>, int, unsigned int, mojo::HandleSignalsState>, void ()>::Run(base::internal::BindStateBase*) at ??:? #38: base::OnceCallback<void ()>::Run() && at ??:? #39: base::debug::TaskAnnotator::RunTask(char const*, base::PendingTask*) at ??:? ... base::RunLoop stuff ... #46: base::RunLoop::Run() at ??:? #47: (anonymous namespace)::CreateRunLoopAndRun(base::RunLoop**) at ??:? #48: (anonymous namespace)::IPCChannelBadMessageTest::SendQuitMessageAndWaitForIdle() at ??:? #49: (anonymous namespace)::IPCChannelBadMessageTest_BadMessage_Test::TestBody() at ??:? ...
,
Mar 8 2018
It appears that this is a race-condition between delivery of a Node update message to the peer, and the peer tearing down the underlying platform channel. I am able to repro a failure in the IPCChannelBadMessageTest.BadMessage test under e.g. Linux by introducing an artificial delay (see https://chromium-review.googlesource.com/c/chromium/src/+/954443), sufficient to allow the child process to have fully terminated, and closed the native IPC channel, before we try to call on to InitiateproxyRemoval(), which results in the Node update being sent.
,
Mar 8 2018
Running IPCMojoBootstrapTest.ReceiveEmptyMessage with SetMinLogLevel(-2), and the repro patch from #14 applied, I get: [175678:175678:0308/112356.146953:2503994187262:VERBOSE2:node.cc(786)] Created port 5CDC66076F901318.6B946D787CEDD930@DA638EC8A6563EE9.2F27A2D8F98FAB00 [175678:175678:0308/112356.147048:2503994187345:VERBOSE2:node.cc(786)] Created port E877BA60F7B70C94.C976F798C584191@DA638EC8A6563EE9.2F27A2D8F98FAB00 [175678:175678:0308/112356.147097:2503994187394:VERBOSE2:node_controller.cc(721)] Ignoring status change for 5CDC66076F901318.6B946D787CEDD930 because it doesn't have an observer. [175678:175678:0308/112356.147140:2503994187438:VERBOSE2:node_controller.cc(721)] Ignoring status change for E877BA60F7B70C94.C976F798C584191 because it doesn't have an observer. [175678:175678:0308/112356.147184:2503994187481:VERBOSE2:message_pipe_dispatcher.cc(96)] Creating new MessagePipeDispatcher for port 5CDC66076F901318.6B946D787CEDD930 [pipe_id=9187201950435737471; endpoint=0] [175678:175678:0308/112356.148592:2503994188894:VERBOSE2:node_controller.cc(721)] Ignoring status change for E877BA60F7B70C94.C976F798C584191 because it doesn't have an observer. [175678:175679:0308/112356.170809:2503994211117:VERBOSE1:node_controller.cc(789)] Node DA638EC8A6563EE9.2F27A2D8F98FAB00 accepted invitee 74138D154D2AC45F.C47AB9B677782C4 [175678:175679:0308/112356.170943:2503994211246:VERBOSE2:node_controller.cc(497)] Accepting new peer 74138D154D2AC45F.C47AB9B677782C4 on node DA638EC8A6563EE9.2F27A2D8F98FAB00 [175678:175679:0308/112356.171249:2503994211556:VERBOSE2:node_controller.cc(989)] Node DA638EC8A6563EE9.2F27A2D8F98FAB00 received RequestPortMerge for name test_pipe and port 5BEB17DD2D5E3109.35E5C9433F859D1C@74138D154D2AC45F.C47AB9B677782C4 [175678:175679:0308/112356.171393:2503994211694:VERBOSE1:node.cc(398)] Sending MergePort from E877BA60F7B70C94.C976F798C584191@DA638EC8A6563EE9.2F27A2D8F98FAB00 to 5BEB17DD2D5E3109.35E5C9433F859D1C@74138D154D2AC45F.C47AB9B677782C4 [175678:175679:0308/112356.171664:2503994211966:VERBOSE2:node.cc(540)] PortAccepted at E877BA60F7B70C94.C976F798C584191@DA638EC8A6563EE9.2F27A2D8F98FAB00 pointing to 647F1FDDEB862E84.D124A1FCF4A5375B@74138D154D2AC45F.C47AB9B677782C4 ^---- up to this point is identical to the trace without the patch (below). [175678:175679:0308/112357.172290:2503995212600:VERBOSE1:node_controller.cc(527)] Dropped peer 74138D154D2AC45F.C47AB9B677782C4 [175678:175679:0308/112357.172446:2503995212750:VERBOSE1:node.cc(435)] Observing lost connection from node DA638EC8A6563EE9.2F27A2D8F98FAB00 to node 74138D154D2AC45F.C47AB9B677782C4 [175678:175679:0308/112357.172577:2503995212880:VERBOSE2:node.cc(809)] Deleted port E877BA60F7B70C94.C976F798C584191@DA638EC8A6563EE9.2F27A2D8F98FAB00 [175678:175679:0308/112357.172650:2503995212952:VERBOSE2:node.cc(1339)] Forcibly deleted port E877BA60F7B70C94.C976F798C584191@DA638EC8A6563EE9.2F27A2D8F98FAB00 [175678:175679:0308/112357.172772:2503995213075:VERBOSE2:message_pipe_dispatcher.cc(360)] Peer closure detected on message pipe 9187201950435737471 endpoint 0 [port=5CDC66076F901318.6B946D787CEDD930] whereas without the patch I see: [176083:176083:0308/112657.927113:2504175967420:VERBOSE2:node.cc(786)] Created port 6C541D14264C2297.4F79786702084FE1@AE1D8068B8EFCF22.18128A3E19699EEC [176083:176083:0308/112657.927201:2504175967498:VERBOSE2:node.cc(786)] Created port 1B4C89D5513F6C30.7B09BB60BD03A131@AE1D8068B8EFCF22.18128A3E19699EEC [176083:176083:0308/112657.927240:2504175967537:VERBOSE2:node_controller.cc(721)] Ignoring status change for 6C541D14264C2297.4F79786702084FE1 because it doesn't have an observer. [176083:176083:0308/112657.927281:2504175967577:VERBOSE2:node_controller.cc(721)] Ignoring status change for 1B4C89D5513F6C30.7B09BB60BD03A131 because it doesn't have an observer. [176083:176083:0308/112657.927319:2504175967615:VERBOSE2:message_pipe_dispatcher.cc(96)] Creating new MessagePipeDispatcher for port 6C541D14264C2297.4F79786702084FE1 [pipe_id=9187201950435737471; endpoint=0] [176083:176083:0308/112657.928904:2504175969214:VERBOSE2:node_controller.cc(721)] Ignoring status change for 1B4C89D5513F6C30.7B09BB60BD03A131 because it doesn't have an observer. [176083:176084:0308/112657.943205:2504175983512:VERBOSE1:node_controller.cc(789)] Node AE1D8068B8EFCF22.18128A3E19699EEC accepted invitee BAE5C86F7BEB9CE8.B9521AC353BD4DDD [176083:176084:0308/112657.943349:2504175983651:VERBOSE2:node_controller.cc(497)] Accepting new peer BAE5C86F7BEB9CE8.B9521AC353BD4DDD on node AE1D8068B8EFCF22.18128A3E19699EEC [176083:176084:0308/112657.943570:2504175983876:VERBOSE2:node_controller.cc(989)] Node AE1D8068B8EFCF22.18128A3E19699EEC received RequestPortMerge for name test_pipe and port 94B7C9E4A6988CCE.8ACFEF97F7611FDF@BAE5C86F7BEB9CE8.B9521AC353BD4DDD [176083:176084:0308/112657.943695:2504175983996:VERBOSE1:node.cc(398)] Sending MergePort from 1B4C89D5513F6C30.7B09BB60BD03A131@AE1D8068B8EFCF22.18128A3E19699EEC to 94B7C9E4A6988CCE.8ACFEF97F7611FDF@BAE5C86F7BEB9CE8.B9521AC353BD4DDD [176083:176084:0308/112657.944048:2504175984354:VERBOSE2:node.cc(540)] PortAccepted at 1B4C89D5513F6C30.7B09BB60BD03A131@AE1D8068B8EFCF22.18128A3E19699EEC pointing to BCDC4A893346D7A7.4A21D133EBDDA6CC@BAE5C86F7BEB9CE8.B9521AC353BD4DDD ^---- up to this point is identical to the trace with the patch (above). [176083:176084:0308/112657.944445:2504175984749:VERBOSE2:node.cc(574)] ObserveProxy at 6C541D14264C2297.4F79786702084FE1@AE1D8068B8EFCF22.18128A3E19699EEC, proxy at BCDC4A893346D7A7.4A21D133EBDDA6CC@BAE5C86F7BEB9CE8.B9521AC353BD4DDD pointing to F79C58FEDF15F13E.92CCFFC8BA033DDD@BAE5C86F7BEB9CE8.B9521AC353BD4DDD [176083:176084:0308/112657.944561:2504175984866:VERBOSE2:node.cc(574)] ObserveProxy at 1B4C89D5513F6C30.7B09BB60BD03A131@AE1D8068B8EFCF22.18128A3E19699EEC, proxy at BCDC4A893346D7A7.4A21D133EBDDA6CC@BAE5C86F7BEB9CE8.B9521AC353BD4DDD pointing to F79C58FEDF15F13E.92CCFFC8BA033DDD@BAE5C86F7BEB9CE8.B9521AC353BD4DDD [176083:176084:0308/112657.944649:2504175984949:VERBOSE2:node.cc(610)] Delaying ObserveProxyAck to BCDC4A893346D7A7.4A21D133EBDDA6CC@BAE5C86F7BEB9CE8.B9521AC353BD4DDD [176083:176084:0308/112657.944773:2504175985074:VERBOSE2:node.cc(574)] ObserveProxy at 6C541D14264C2297.4F79786702084FE1@AE1D8068B8EFCF22.18128A3E19699EEC, proxy at 94B7C9E4A6988CCE.8ACFEF97F7611FDF@BAE5C86F7BEB9CE8.B9521AC353BD4DDD pointing to 6C541D14264C2297.4F79786702084FE1@AE1D8068B8EFCF22.18128A3E19699EEC [176083:176084:0308/112657.944860:2504175985160:VERBOSE2:node.cc(574)] ObserveProxy at 1B4C89D5513F6C30.7B09BB60BD03A131@AE1D8068B8EFCF22.18128A3E19699EEC, proxy at 94B7C9E4A6988CCE.8ACFEF97F7611FDF@BAE5C86F7BEB9CE8.B9521AC353BD4DDD pointing to 6C541D14264C2297.4F79786702084FE1@AE1D8068B8EFCF22.18128A3E19699EEC [176083:176084:0308/112657.945009:2504175985310:VERBOSE2:node.cc(574)] ObserveProxy at 6C541D14264C2297.4F79786702084FE1@AE1D8068B8EFCF22.18128A3E19699EEC, proxy at 1B4C89D5513F6C30.7B09BB60BD03A131@AE1D8068B8EFCF22.18128A3E19699EEC pointing to BCDC4A893346D7A7.4A21D133EBDDA6CC@BAE5C86F7BEB9CE8.B9521AC353BD4DDD [176083:176084:0308/112657.945107:2504175985407:VERBOSE2:node.cc(596)] Forwarding ObserveProxyAck from 6C541D14264C2297.4F79786702084FE1@AE1D8068B8EFCF22.18128A3E19699EEC to 1B4C89D5513F6C30.7B09BB60BD03A131@AE1D8068B8EFCF22.18128A3E19699EEC [176083:176084:0308/112657.945182:2504175985482:VERBOSE2:node.cc(637)] ObserveProxyAck at 1B4C89D5513F6C30.7B09BB60BD03A131@AE1D8068B8EFCF22.18128A3E19699EEC (last_sequence_num=1) [176083:176084:0308/112657.945283:2504175985583:VERBOSE2:node.cc(809)] Deleted port 1B4C89D5513F6C30.7B09BB60BD03A131@AE1D8068B8EFCF22.18128A3E19699EEC [176083:176084:0308/112657.945489:2504175985793:VERBOSE2:node.cc(694)] ObserveClosure at 6C541D14264C2297.4F79786702084FE1@AE1D8068B8EFCF22.18128A3E19699EEC (state=1) pointing to BCDC4A893346D7A7.4A21D133EBDDA6CC@BAE5C86F7BEB9CE8.B9521AC353BD4DDD (last_sequence_num=3) [176083:176084:0308/112657.945608:2504175985909:VERBOSE2:node.cc(724)] Forwarding ObserveClosure from 6C541D14264C2297.4F79786702084FE1@AE1D8068B8EFCF22.18128A3E19699EEC to peer BCDC4A893346D7A7.4A21D133EBDDA6CC@BAE5C86F7BEB9CE8.B9521AC353BD4DDD (last_sequence_num=1) [176083:176084:0308/112657.945714:2504175986014:VERBOSE2:message_pipe_dispatcher.cc(360)] Peer closure detected on message pipe 9187201950435737471 endpoint 0 [port=6C541D14264C2297.4F79786702084FE1] [176083:176084:0308/112657.946991:2504175987295:VERBOSE1:node_controller.cc(527)] Dropped peer BAE5C86F7BEB9CE8.B9521AC353BD4DDD [176083:176084:0308/112657.947085:2504175987386:VERBOSE1:node.cc(435)] Observing lost connection from node AE1D8068B8EFCF22.18128A3E19699EEC to node BAE5C86F7BEB9CE8.B9521AC353BD4DDD [176083:176083:0308/112657.948776:2504175989083:VERBOSE2:message_pipe_dispatcher.cc(134)] Closing message pipe 9187201950435737471 endpoint 0 [port=6C541D14264C2297.4F79786702084FE1] [176083:176083:0308/112657.948890:2504175989192:VERBOSE2:node.cc(809)] Deleted port 6C541D14264C2297.4F79786702084FE1@AE1D8068B8EFCF22.18128A3E19699EEC [176083:176083:0308/112657.948960:2504175989261:VERBOSE2:node.cc(282)] Sending ObserveClosure from 6C541D14264C2297.4F79786702084FE1@AE1D8068B8EFCF22.18128A3E19699EEC to BCDC4A893346D7A7.4A21D133EBDDA6CC@BAE5C86F7BEB9CE8.B9521AC353BD4DDD [176083:176083:0308/112657.949050:2504175989351:VERBOSE1:node_controller.cc(629)] Dropping message for unknown peer: BAE5C86F7BEB9CE8.B9521AC353BD4DDD
,
Mar 8 2018
Running IPCChannelBadMessageTest.BadMessage with the repro patch: [177468:177468:0308/113218.435439:2504496475739:VERBOSE2:node.cc(786)] Created port E26DA1EA7C3DA8E9.7B963D5C7503563C@24AA90917416AFAB.116F15595B43FD50 [177468:177468:0308/113218.435519:2504496475816:VERBOSE2:node.cc(786)] Created port 27AC433286CE3F9C.6603A1213D410ACD@24AA90917416AFAB.116F15595B43FD50 [177468:177468:0308/113218.435562:2504496475868:VERBOSE2:node_controller.cc(721)] Ignoring status change for E26DA1EA7C3DA8E9.7B963D5C7503563C because it doesn't have an observer. [177468:177468:0308/113218.435611:2504496475908:VERBOSE2:node_controller.cc(721)] Ignoring status change for 27AC433286CE3F9C.6603A1213D410ACD because it doesn't have an observer. [177468:177468:0308/113218.435646:2504496475942:VERBOSE2:message_pipe_dispatcher.cc(96)] Creating new MessagePipeDispatcher for port E26DA1EA7C3DA8E9.7B963D5C7503563C [pipe_id=9187201950435737471; endpoint=0] [177468:177468:0308/113218.437060:2504496477374:VERBOSE2:node_controller.cc(721)] Ignoring status change for 27AC433286CE3F9C.6603A1213D410ACD because it doesn't have an observer. [177468:177468:0308/113218.437421:2504496477725:VERBOSE2:node_controller.cc(721)] Ignoring status change for 27AC433286CE3F9C.6603A1213D410ACD because it doesn't have an observer. [177468:177468:0308/113218.437621:2504496477924:VERBOSE2:node_controller.cc(721)] Ignoring status change for 27AC433286CE3F9C.6603A1213D410ACD because it doesn't have an observer. [177468:177469:0308/113218.465120:2504496505427:VERBOSE1:node_controller.cc(789)] Node 24AA90917416AFAB.116F15595B43FD50 accepted invitee 8266D204C4892860.724DBB3F25A0A42F [177468:177469:0308/113218.465261:2504496505563:VERBOSE2:node_controller.cc(497)] Accepting new peer 8266D204C4892860.724DBB3F25A0A42F on node 24AA90917416AFAB.116F15595B43FD50 [177468:177469:0308/113218.465551:2504496505858:VERBOSE2:node_controller.cc(989)] Node 24AA90917416AFAB.116F15595B43FD50 received RequestPortMerge for name test_pipe and port 1FCBF9AA65820BF8.C4DDB91D0B00A3C@8266D204C4892860.724DBB3F25A0A42F [177468:177469:0308/113218.465667:2504496505968:VERBOSE1:node.cc(398)] Sending MergePort from 27AC433286CE3F9C.6603A1213D410ACD@24AA90917416AFAB.116F15595B43FD50 to 1FCBF9AA65820BF8.C4DDB91D0B00A3C@8266D204C4892860.724DBB3F25A0A42F [177468:177469:0308/113218.466019:2504496506326:VERBOSE2:node.cc(540)] PortAccepted at 27AC433286CE3F9C.6603A1213D410ACD@24AA90917416AFAB.116F15595B43FD50 pointing to 8B9A723C8576DDCC.E2907E872E4F5761@8266D204C4892860.724DBB3F25A0A42F ^-- appears identical up to here. [177468:177469:0308/113219.466603:2504497506914:VERBOSE1:node_controller.cc(527)] Dropped peer 8266D204C4892860.724DBB3F25A0A42F [177468:177469:0308/113219.466772:2504497507075:VERBOSE1:node.cc(435)] Observing lost connection from node 24AA90917416AFAB.116F15595B43FD50 to node 8266D204C4892860.724DBB3F25A0A42F [177468:177469:0308/113219.466889:2504497507191:VERBOSE2:node.cc(809)] Deleted port 27AC433286CE3F9C.6603A1213D410ACD@24AA90917416AFAB.116F15595B43FD50 [177468:177469:0308/113219.466965:2504497507267:VERBOSE2:node.cc(1339)] Forcibly deleted port 27AC433286CE3F9C.6603A1213D410ACD@24AA90917416AFAB.116F15595B43FD50 [177468:177469:0308/113219.467091:2504497507404:VERBOSE2:message_pipe_dispatcher.cc(360)] Peer closure detected on message pipe 9187201950435737471 endpoint 0 [port=E26DA1EA7C3DA8E9.7B963D5C7503563C] [177468:177468:0308/113219.467952:2504497508260:FATAL:ipc_channel_proxy_unittest.cc(75)] Check failed: quit_message_received_. versus without: [177257:177257:0308/113117.312615:2504435352914:VERBOSE2:node.cc(786)] Created port 7EE2166142A147AE.ABE50F4FB87870DC@F4EB2300055074AC.C8A6EFC63ACE2BA2 [177257:177257:0308/113117.312697:2504435352995:VERBOSE2:node.cc(786)] Created port 884D339BF13E5F9C.F0C2E2093E24FA8E@F4EB2300055074AC.C8A6EFC63ACE2BA2 [177257:177257:0308/113117.312746:2504435353052:VERBOSE2:node_controller.cc(721)] Ignoring status change for 7EE2166142A147AE.ABE50F4FB87870DC because it doesn't have an observer. [177257:177257:0308/113117.312798:2504435353095:VERBOSE2:node_controller.cc(721)] Ignoring status change for 884D339BF13E5F9C.F0C2E2093E24FA8E because it doesn't have an observer. [177257:177257:0308/113117.312839:2504435353136:VERBOSE2:message_pipe_dispatcher.cc(96)] Creating new MessagePipeDispatcher for port 7EE2166142A147AE.ABE50F4FB87870DC [pipe_id=9187201950435737471; endpoint=0] [177257:177257:0308/113117.314096:2504435354398:VERBOSE2:node_controller.cc(721)] Ignoring status change for 884D339BF13E5F9C.F0C2E2093E24FA8E because it doesn't have an observer. [177257:177257:0308/113117.314292:2504435354590:VERBOSE2:node_controller.cc(721)] Ignoring status change for 884D339BF13E5F9C.F0C2E2093E24FA8E because it doesn't have an observer. [177257:177257:0308/113117.314406:2504435354703:VERBOSE2:node_controller.cc(721)] Ignoring status change for 884D339BF13E5F9C.F0C2E2093E24FA8E because it doesn't have an observer. [177257:177258:0308/113117.339097:2504435379405:VERBOSE1:node_controller.cc(789)] Node F4EB2300055074AC.C8A6EFC63ACE2BA2 accepted invitee C0392218FC033CAC.3372B36A6CBB2A21 [177257:177258:0308/113117.339251:2504435379553:VERBOSE2:node_controller.cc(497)] Accepting new peer C0392218FC033CAC.3372B36A6CBB2A21 on node F4EB2300055074AC.C8A6EFC63ACE2BA2 [177257:177258:0308/113117.339539:2504435379846:VERBOSE2:node_controller.cc(989)] Node F4EB2300055074AC.C8A6EFC63ACE2BA2 received RequestPortMerge for name test_pipe and port 77290FBC5C0496.B59621FE5E84D775@C0392218FC033CAC.3372B36A6CBB2A21 [177257:177258:0308/113117.339656:2504435379957:VERBOSE1:node.cc(398)] Sending MergePort from 884D339BF13E5F9C.F0C2E2093E24FA8E@F4EB2300055074AC.C8A6EFC63ACE2BA2 to 77290FBC5C0496.B59621FE5E84D775@C0392218FC033CAC.3372B36A6CBB2A21 [177257:177258:0308/113117.339974:2504435380275:VERBOSE2:node.cc(540)] PortAccepted at 884D339BF13E5F9C.F0C2E2093E24FA8E@F4EB2300055074AC.C8A6EFC63ACE2BA2 pointing to EE374D430B7BCB7E.FC60FCC822D5C890@C0392218FC033CAC.3372B36A6CBB2A21 ^-- appears identical up to here. [177257:177258:0308/113117.340342:2504435380644:VERBOSE2:node.cc(574)] ObserveProxy at 7EE2166142A147AE.ABE50F4FB87870DC@F4EB2300055074AC.C8A6EFC63ACE2BA2, proxy at EE374D430B7BCB7E.FC60FCC822D5C890@C0392218FC033CAC.3372B36A6CBB2A21 pointing to 577DB812978E3413.25621097804E2E7C@C0392218FC033CAC.3372B36A6CBB2A21 [177257:177258:0308/113117.340467:2504435380773:VERBOSE2:node.cc(574)] ObserveProxy at 884D339BF13E5F9C.F0C2E2093E24FA8E@F4EB2300055074AC.C8A6EFC63ACE2BA2, proxy at EE374D430B7BCB7E.FC60FCC822D5C890@C0392218FC033CAC.3372B36A6CBB2A21 pointing to 577DB812978E3413.25621097804E2E7C@C0392218FC033CAC.3372B36A6CBB2A21 [177257:177258:0308/113117.340576:2504435380877:VERBOSE2:node.cc(610)] Delaying ObserveProxyAck to EE374D430B7BCB7E.FC60FCC822D5C890@C0392218FC033CAC.3372B36A6CBB2A21 [177257:177258:0308/113117.340681:2504435380982:VERBOSE2:node.cc(574)] ObserveProxy at 7EE2166142A147AE.ABE50F4FB87870DC@F4EB2300055074AC.C8A6EFC63ACE2BA2, proxy at 77290FBC5C0496.B59621FE5E84D775@C0392218FC033CAC.3372B36A6CBB2A21 pointing to 7EE2166142A147AE.ABE50F4FB87870DC@F4EB2300055074AC.C8A6EFC63ACE2BA2 [177257:177258:0308/113117.340764:2504435381064:VERBOSE2:node.cc(574)] ObserveProxy at 884D339BF13E5F9C.F0C2E2093E24FA8E@F4EB2300055074AC.C8A6EFC63ACE2BA2, proxy at 77290FBC5C0496.B59621FE5E84D775@C0392218FC033CAC.3372B36A6CBB2A21 pointing to 7EE2166142A147AE.ABE50F4FB87870DC@F4EB2300055074AC.C8A6EFC63ACE2BA2 [177257:177258:0308/113117.340925:2504435381226:VERBOSE2:node.cc(574)] ObserveProxy at 7EE2166142A147AE.ABE50F4FB87870DC@F4EB2300055074AC.C8A6EFC63ACE2BA2, proxy at 884D339BF13E5F9C.F0C2E2093E24FA8E@F4EB2300055074AC.C8A6EFC63ACE2BA2 pointing to EE374D430B7BCB7E.FC60FCC822D5C890@C0392218FC033CAC.3372B36A6CBB2A21 [177257:177258:0308/113117.341038:2504435381339:VERBOSE2:node.cc(596)] Forwarding ObserveProxyAck from 7EE2166142A147AE.ABE50F4FB87870DC@F4EB2300055074AC.C8A6EFC63ACE2BA2 to 884D339BF13E5F9C.F0C2E2093E24FA8E@F4EB2300055074AC.C8A6EFC63ACE2BA2 [177257:177258:0308/113117.341115:2504435381415:VERBOSE2:node.cc(637)] ObserveProxyAck at 884D339BF13E5F9C.F0C2E2093E24FA8E@F4EB2300055074AC.C8A6EFC63ACE2BA2 (last_sequence_num=3) [177257:177258:0308/113117.341207:2504435381508:VERBOSE2:node.cc(809)] Deleted port 884D339BF13E5F9C.F0C2E2093E24FA8E@F4EB2300055074AC.C8A6EFC63ACE2BA2 [177257:177258:0308/113117.341532:2504435381833:VERBOSE2:node.cc(574)] ObserveProxy at 7EE2166142A147AE.ABE50F4FB87870DC@F4EB2300055074AC.C8A6EFC63ACE2BA2, proxy at EE374D430B7BCB7E.FC60FCC822D5C890@C0392218FC033CAC.3372B36A6CBB2A21 pointing to 577DB812978E3413.25621097804E2E7C@C0392218FC033CAC.3372B36A6CBB2A21 [177257:177258:0308/113117.341646:2504435381947:VERBOSE2:node.cc(596)] Forwarding ObserveProxyAck from 7EE2166142A147AE.ABE50F4FB87870DC@F4EB2300055074AC.C8A6EFC63ACE2BA2 to EE374D430B7BCB7E.FC60FCC822D5C890@C0392218FC033CAC.3372B36A6CBB2A21 [177257:177258:0308/113117.341804:2504435382105:VERBOSE2:node.cc(694)] ObserveClosure at 7EE2166142A147AE.ABE50F4FB87870DC@F4EB2300055074AC.C8A6EFC63ACE2BA2 (state=1) pointing to 577DB812978E3413.25621097804E2E7C@C0392218FC033CAC.3372B36A6CBB2A21 (last_sequence_num=5) [177257:177258:0308/113117.341915:2504435382216:VERBOSE2:node.cc(724)] Forwarding ObserveClosure from 7EE2166142A147AE.ABE50F4FB87870DC@F4EB2300055074AC.C8A6EFC63ACE2BA2 to peer 577DB812978E3413.25621097804E2E7C@C0392218FC033CAC.3372B36A6CBB2A21 (last_sequence_num=3) [177257:177258:0308/113117.342047:2504435382348:VERBOSE2:message_pipe_dispatcher.cc(360)] Peer closure detected on message pipe 9187201950435737471 endpoint 0 [port=7EE2166142A147AE.ABE50F4FB87870DC] [177257:177258:0308/113117.345928:2504435386232:VERBOSE1:node_controller.cc(527)] Dropped peer C0392218FC033CAC.3372B36A6CBB2A21 [177257:177258:0308/113117.346018:2504435386319:VERBOSE1:node.cc(435)] Observing lost connection from node F4EB2300055074AC.C8A6EFC63ACE2BA2 to node C0392218FC033CAC.3372B36A6CBB2A21 [177257:177257:0308/113117.346401:2504435386708:VERBOSE2:message_pipe_dispatcher.cc(134)] Closing message pipe 9187201950435737471 endpoint 0 [port=7EE2166142A147AE.ABE50F4FB87870DC] [177257:177257:0308/113117.346544:2504435386846:VERBOSE2:node.cc(809)] Deleted port 7EE2166142A147AE.ABE50F4FB87870DC@F4EB2300055074AC.C8A6EFC63ACE2BA2 [177257:177257:0308/113117.346614:2504435386915:VERBOSE2:node.cc(282)] Sending ObserveClosure from 7EE2166142A147AE.ABE50F4FB87870DC@F4EB2300055074AC.C8A6EFC63ACE2BA2 to 577DB812978E3413.25621097804E2E7C@C0392218FC033CAC.3372B36A6CBB2A21 [177257:177257:0308/113117.346731:2504435387032:VERBOSE1:node_controller.cc(629)] Dropping message for unknown peer: C0392218FC033CAC.3372B36A6CBB2A21
,
Mar 8 2018
Reducing the repro sleep to 200ms, I get IPCMojoBootstrapTest.ReceiveEmptyMessage runs that look identical up to the sleep: 179949:179949:0308/114407.814065:2505205854364:VERBOSE2:node.cc(786)] Created port CE38CF2893AC91A9.B7B984E6370CA228@B6E4120287BBD1E3.882349BCB43F2A41 [179949:179949:0308/114407.814144:2505205854447:VERBOSE2:node.cc(786)] Created port 8302860777A211D8.4CC287C61808DF60@B6E4120287BBD1E3.882349BCB43F2A41 [179949:179949:0308/114407.814187:2505205854483:VERBOSE2:node_controller.cc(721)] Ignoring status change for CE38CF2893AC91A9.B7B984E6370CA228 because it doesn't have an observer. [179949:179949:0308/114407.814241:2505205854537:VERBOSE2:node_controller.cc(721)] Ignoring status change for 8302860777A211D8.4CC287C61808DF60 because it doesn't have an observer. [179949:179949:0308/114407.814276:2505205854572:VERBOSE2:message_pipe_dispatcher.cc(96)] Creating new MessagePipeDispatcher for port CE38CF2893AC91A9.B7B984E6370CA228 [pipe_id=9187201950435737471; endpoint=0] [179949:179949:0308/114407.816000:2505205856309:VERBOSE2:node_controller.cc(721)] Ignoring status change for 8302860777A211D8.4CC287C61808DF60 because it doesn't have an observer. [179951:179952:0308/114407.846083:2505205886390:VERBOSE1:node_controller.cc(754)] Broker client A3C11982ADEE03AE.AFEC689BC1602D64 accepting invitation from B6E4120287BBD1E3.882349BCB43F2A41 [179949:179950:0308/114407.846164:2505205886474:VERBOSE1:node_controller.cc(789)] Node B6E4120287BBD1E3.882349BCB43F2A41 accepted invitee A3C11982ADEE03AE.AFEC689BC1602D64 [179949:179950:0308/114407.846331:2505205886637:VERBOSE2:node_controller.cc(497)] Accepting new peer A3C11982ADEE03AE.AFEC689BC1602D64 on node B6E4120287BBD1E3.882349BCB43F2A41 [179951:179952:0308/114407.846542:2505205886852:VERBOSE2:node_controller.cc(497)] Accepting new peer B6E4120287BBD1E3.882349BCB43F2A41 on node A3C11982ADEE03AE.AFEC689BC1602D64 [179951:179951:0308/114407.846664:2505205886969:VERBOSE2:node_controller.cc(721)] Ignoring status change for 1E168D8E76A3FA03.DC139092D1D6CDF6 because it doesn't have an observer. [179951:179952:0308/114407.846690:2505205886992:VERBOSE1:node_controller.cc(963)] Client A3C11982ADEE03AE.AFEC689BC1602D64 accepted by broker B6E4120287BBD1E3.882349BCB43F2A41 [179949:179950:0308/114407.846728:2505205887035:VERBOSE2:node_controller.cc(989)] Node B6E4120287BBD1E3.882349BCB43F2A41 received RequestPortMerge for name test_pipe and port 1E168D8E76A3FA03.DC139092D1D6CDF6@A3C11982ADEE03AE.AFEC689BC1602D64 [179951:179951:0308/114407.846892:2505205887196:VERBOSE2:node_controller.cc(721)] Ignoring status change for 1E168D8E76A3FA03.DC139092D1D6CDF6 because it doesn't have an observer. [179949:179950:0308/114407.846892:2505205887200:VERBOSE1:node.cc(398)] Sending MergePort from 8302860777A211D8.4CC287C61808DF60@B6E4120287BBD1E3.882349BCB43F2A41 to 1E168D8E76A3FA03.DC139092D1D6CDF6@A3C11982ADEE03AE.AFEC689BC1602D64 [179951:179952:0308/114407.847176:2505205887480:VERBOSE1:node.cc(749)] MergePort at 1E168D8E76A3FA03.DC139092D1D6CDF6@A3C11982ADEE03AE.AFEC689BC1602D64 merging with proxy FA8ACC5E58E74636.BAD40D0925F1BA97@A3C11982ADEE03AE.AFEC689BC1602D64 pointing to CE38CF2893AC91A9.B7B984E6370CA228@B6E4120287BBD1E3.882349BCB43F2A41 referred by 8302860777A211D8.4CC287C61808DF60@B6E4120287BBD1E3.882349BCB43F2A41 [179951:179952:0308/114407.847323:2505205887640:VERBOSE2:node.cc(1006)] Accepting port FA8ACC5E58E74636.BAD40D0925F1BA97 [peer_closed=0; last_sequence_num_to_receive=0] [179951:179952:0308/114407.847442:2505205887744:VERBOSE2:node.cc(786)] Created port FA8ACC5E58E74636.BAD40D0925F1BA97@A3C11982ADEE03AE.AFEC689BC1602D64 [179951:179952:0308/114407.847639:2505205887946:VERBOSE2:node.cc(574)] ObserveProxy at 5998C8A4C8A2204D.BD4E5B002547B3BB@A3C11982ADEE03AE.AFEC689BC1602D64, proxy at FA8ACC5E58E74636.BAD40D0925F1BA97@A3C11982ADEE03AE.AFEC689BC1602D64 pointing to 5998C8A4C8A2204D.BD4E5B002547B3BB@A3C11982ADEE03AE.AFEC689BC1602D64 [179949:179950:0308/114407.847650:2505205887951:VERBOSE2:node.cc(540)] PortAccepted at 8302860777A211D8.4CC287C61808DF60@B6E4120287BBD1E3.882349BCB43F2A41 pointing to FA8ACC5E58E74636.BAD40D0925F1BA97@A3C11982ADEE03AE.AFEC689BC1602D64 and then either succeed: [179951:179952:0308/114407.847774:2505205888076:VERBOSE2:node.cc(574)] ObserveProxy at 1E168D8E76A3FA03.DC139092D1D6CDF6@A3C11982ADEE03AE.AFEC689BC1602D64, proxy at FA8ACC5E58E74636.BAD40D0925F1BA97@A3C11982ADEE03AE.AFEC689BC1602D64 pointing to 5998C8A4C8A2204D.BD4E5B002547B3BB@A3C11982ADEE03AE.AFEC689BC1602D64 [179949:179950:0308/114407.847829:2505205888131:ERROR:node.cc(1196)] SLEEP [179951:179951:0308/114407.848524:2505205888825:VERBOSE2:message_pipe_dispatcher.cc(134)] Closing message pipe 9187201950435737471 endpoint 1 [port=5998C8A4C8A2204D.BD4E5B002547B3BB] [179951:179951:0308/114407.848613:2505205888912:VERBOSE2:node.cc(809)] Deleted port 5998C8A4C8A2204D.BD4E5B002547B3BB@A3C11982ADEE03AE.AFEC689BC1602D64 [179951:179951:0308/114407.848672:2505205888970:VERBOSE2:node.cc(282)] Sending ObserveClosure from 5998C8A4C8A2204D.BD4E5B002547B3BB@A3C11982ADEE03AE.AFEC689BC1602D64 to 1E168D8E76A3FA03.DC139092D1D6CDF6@A3C11982ADEE03AE.AFEC689BC1602D64 [179951:179951:0308/114407.848735:2505205889034:VERBOSE2:node.cc(694)] ObserveClosure at 1E168D8E76A3FA03.DC139092D1D6CDF6@A3C11982ADEE03AE.AFEC689BC1602D64 (state=3) pointing to CE38CF2893AC91A9.B7B984E6370CA228@B6E4120287BBD1E3.882349BCB43F2A41 (last_sequence_num=3) [179951:179951:0308/114407.848796:2505205889094:VERBOSE2:node.cc(724)] Forwarding ObserveClosure from 1E168D8E76A3FA03.DC139092D1D6CDF6@A3C11982ADEE03AE.AFEC689BC1602D64 to peer CE38CF2893AC91A9.B7B984E6370CA228@B6E4120287BBD1E3.882349BCB43F2A41 (last_sequence_num=3) [179951:179951:0308/114407.848858:2505205889157:VERBOSE2:node.cc(809)] Deleted port 1E168D8E76A3FA03.DC139092D1D6CDF6@A3C11982ADEE03AE.AFEC689BC1602D64 [179949:179950:0308/114408.048471:2505206088783:VERBOSE1:node_controller.cc(527)] Dropped peer A3C11982ADEE03AE.AFEC689BC1602D64 [179949:179950:0308/114408.048622:2505206088929:VERBOSE1:node.cc(435)] Observing lost connection from node B6E4120287BBD1E3.882349BCB43F2A41 to node A3C11982ADEE03AE.AFEC689BC1602D64 [179949:179950:0308/114408.048759:2505206089062:VERBOSE2:node.cc(809)] Deleted port 8302860777A211D8.4CC287C61808DF60@B6E4120287BBD1E3.882349BCB43F2A41 [179949:179950:0308/114408.048831:2505206089133:VERBOSE2:node.cc(1340)] Forcibly deleted port 8302860777A211D8.4CC287C61808DF60@B6E4120287BBD1E3.882349BCB43F2A41 [179949:179950:0308/114408.048975:2505206089277:VERBOSE2:message_pipe_dispatcher.cc(360)] Peer closure detected on message pipe 9187201950435737471 endpoint 0 [port=CE38CF2893AC91A9.B7B984E6370CA228] [179949:179949:0308/114408.049673:2505206089977:VERBOSE2:message_pipe_dispatcher.cc(134)] Closing message pipe 9187201950435737471 endpoint 0 [port=CE38CF2893AC91A9.B7B984E6370CA228] [179949:179949:0308/114408.049823:2505206090125:VERBOSE2:node.cc(809)] Deleted port CE38CF2893AC91A9.B7B984E6370CA228@B6E4120287BBD1E3.882349BCB43F2A41 [179949:179949:0308/114408.049891:2505206090191:VERBOSE2:node.cc(282)] Sending ObserveClosure from CE38CF2893AC91A9.B7B984E6370CA228@B6E4120287BBD1E3.882349BCB43F2A41 to 8302860777A211D8.4CC287C61808DF60@B6E4120287BBD1E3.882349BCB43F2A41 or fail: [179953:179954:0308/114410.503341:2505208543643:ERROR:node.cc(1196)] SLEEP [179955:179956:0308/114410.503313:2505208543615:VERBOSE2:node.cc(574)] ObserveProxy at 5B8EC058D8B5B1E2.C1BF3238901CB1C4@7BE30C0747773B49.5A03F6C01E2DC0B, proxy at 9554886FA5839852.F4CE2FBE8FDBD6B1@7BE30C0747773B49.5A03F6C01E2DC0B pointing to C2169F90BCFD8AE0.FD366184862FD931@7BE30C0747773B49.5A03F6C01E2DC0B [179955:179955:0308/114410.504347:2505208544652:VERBOSE2:message_pipe_dispatcher.cc(134)] Closing message pipe 9187201950435737471 endpoint 1 [port=C2169F90BCFD8AE0.FD366184862FD931] [179955:179955:0308/114410.504462:2505208544764:VERBOSE2:node.cc(809)] Deleted port C2169F90BCFD8AE0.FD366184862FD931@7BE30C0747773B49.5A03F6C01E2DC0B [179955:179955:0308/114410.504539:2505208544839:VERBOSE2:node.cc(282)] Sending ObserveClosure from C2169F90BCFD8AE0.FD366184862FD931@7BE30C0747773B49.5A03F6C01E2DC0B to 5B8EC058D8B5B1E2.C1BF3238901CB1C4@7BE30C0747773B49.5A03F6C01E2DC0B [179955:179955:0308/114410.504619:2505208544920:VERBOSE2:node.cc(694)] ObserveClosure at 5B8EC058D8B5B1E2.C1BF3238901CB1C4@7BE30C0747773B49.5A03F6C01E2DC0B (state=3) pointing to 6C726A7CCD7F17D4.C4D4D72A373B92E0@C5DEB589D7D41E6.997A05BAB1EF094F (last_sequence_num=3) [179955:179955:0308/114410.504694:2505208544994:VERBOSE2:node.cc(724)] Forwarding ObserveClosure from 5B8EC058D8B5B1E2.C1BF3238901CB1C4@7BE30C0747773B49.5A03F6C01E2DC0B to peer 6C726A7CCD7F17D4.C4D4D72A373B92E0@C5DEB589D7D41E6.997A05BAB1EF094F (last_sequence_num=3) [179955:179955:0308/114410.504773:2505208545073:VERBOSE2:node.cc(809)] Deleted port 5B8EC058D8B5B1E2.C1BF3238901CB1C4@7BE30C0747773B49.5A03F6C01E2DC0B [179953:179954:0308/114410.703785:2505208744094:VERBOSE1:node_controller.cc(527)] Dropped peer 7BE30C0747773B49.5A03F6C01E2DC0B [179953:179954:0308/114410.703959:2505208744264:VERBOSE1:node.cc(435)] Observing lost connection from node C5DEB589D7D41E6.997A05BAB1EF094F to node 7BE30C0747773B49.5A03F6C01E2DC0B [179953:179954:0308/114410.704079:2505208744380:VERBOSE2:node.cc(809)] Deleted port 559D13F18E5057BD.40A72522534FED27@C5DEB589D7D41E6.997A05BAB1EF094F [179953:179954:0308/114410.704169:2505208744471:VERBOSE2:node.cc(1340)] Forcibly deleted port 559D13F18E5057BD.40A72522534FED27@C5DEB589D7D41E6.997A05BAB1EF094F [179953:179954:0308/114410.704286:2505208744587:VERBOSE2:message_pipe_dispatcher.cc(360)] Peer closure detected on message pipe 9187201950435737471 endpoint 0 [port=6C726A7CCD7F17D4.C4D4D72A373B92E0]
,
Mar 9 2018
More log output Now With Added IPC:Channel::Send()! ... [201790:201791:0308/133114.247796:2511632288097:ERROR:node.cc(1196)] SLEEP [201792:201792:0308/133114.248261:2511632288572:VERBOSE2:ipc_channel_mojo.cc(220)] sending message @0x18cd9428c660 on channel @0x18cd9429eb60 with type 327717 [201792:201792:0308/133114.248621:2511632288925:VERBOSE2:ipc_channel_mojo.cc(220)] sending message @0x18cd942b6020 on channel @0x18cd9429eb60 with type 393262 [201792:201792:0308/133114.249510:2511632289814:VERBOSE2:message_pipe_dispatcher.cc(134)] Closing message pipe 9187201950435737471 endpoint 1 [port=28D6C7FE28A16C6A.505E99342AA2211D] [201792:201792:0308/133114.249666:2511632289968:VERBOSE2:node.cc(809)] Deleted port 28D6C7FE28A16C6A.505E99342AA2211D@B51677E3BC9F0D30.24B7399638C04E69 [201792:201792:0308/133114.249738:2511632290040:VERBOSE2:node.cc(282)] Sending ObserveClosure from 28D6C7FE28A16C6A.505E99342AA2211D@B51677E3BC9F0D30.24B7399638C04E69 to 5F152440B36B8249.BC06E816C5AB9BA@B51677E3BC9F0D30.24B7399638C04E69 [201792:201792:0308/133114.249828:2511632290129:VERBOSE2:node.cc(694)] ObserveClosure at 5F152440B36B8249.BC06E816C5AB9BA@B51677E3BC9F0D30.24B7399638C04E69 (state=3) pointing to A5E76F0305FA2ADF.862147BE4EA48F5A@916BCEDC7F635E74.29085D65AD5FB54D (last_sequence_num=5) [201792:201792:0308/133114.249913:2511632290214:VERBOSE2:node.cc(724)] Forwarding ObserveClosure from 5F152440B36B8249.BC06E816C5AB9BA@B51677E3BC9F0D30.24B7399638C04E69 to peer A5E76F0305FA2ADF.862147BE4EA48F5A@916BCEDC7F635E74.29085D65AD5FB54D (last_sequence_num=5) [201792:201792:0308/133114.250001:2511632290302:VERBOSE2:node.cc(809)] Deleted port 5F152440B36B8249.BC06E816C5AB9BA@B51677E3BC9F0D30.24B7399638C04E69 [201790:201791:0308/133114.448320:2511632488628:VERBOSE1:node_controller.cc(527)] Dropped peer B51677E3BC9F0D30.24B7399638C04E69 It looks like the receive side of the IPC::ChannelMojo has become connected, enabling the client to receive and process incoming Chrome IPCs (TestMsg_SendBadMessage and then WorkerMsg_Quit). Each message provokes a Chrome IPC to be sent back to the parent, but those are queued rather than being sent immediately - the send side of the channel isn't initialized yet? - then the process exits, closing the underlying platform channel. The test(s) fail because the expected messages were never actually sent by the client. As you'd expect, introducing a delay in the e.g. the ChannelProxyClient before Close() (in addition to the repro-Sleep in mojo::edk::Node) causes the IPCChannelBadMessageTest to start passing again. The failures in IPCMojoBootstrapTest are harder to repro, even with the Sleep() introduced in Node, but this turns out to be because the tests expecting a message from the child process don't actually verify that one was received.
,
Mar 9 2018
OK, so this problem occurs when: 1. Sending process writes messages, then exits. 2. Receiving process tries to write to the channel, from its IO thread - in the IPCChannelBadMessageTest case, this is happening within a channel-readable callback. 3. Write operation fails, because the channel is closed, so we PostTask(OnError) back to the I/O thread. 4. When the IO thread returns to the loop, it processes the posted task ahead of channel-readable notifications. 5. We invoke OnError while data is still in the channel, waiting to be read. I've drafted a CL at https://chromium-review.googlesource.com/c/chromium/src/+/956932 which defers invoking Channel::OnError(), and thereby notifying the delegate, in case of apparent disconnection, until we observe end-of-stream.
,
Mar 9 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/c7fcdeaa38cc77b31d40b19f026c2af1b433855c commit c7fcdeaa38cc77b31d40b19f026c2af1b433855c Author: Wez <wez@chromium.org> Date: Fri Mar 09 06:20:00 2018 Reintroduce verbose logging in IPC::Channel::Send(). Platform-specific IPC::Channel implementations included Chrome IPC message logging, but this was not carried into the Mojo-based Channel implementation. Chrome IPC message logging is higher-level than Mojo IPC message logs, which can be useful for certain kinds of diagnosis. Bug: 816620 Change-Id: I467857fd07b66c3d468039d6fb7c2950f931c8bb Reviewed-on: https://chromium-review.googlesource.com/956347 Reviewed-by: Ken Rockot <rockot@chromium.org> Commit-Queue: Wez <wez@chromium.org> Cr-Commit-Position: refs/heads/master@{#542050} [modify] https://crrev.com/c7fcdeaa38cc77b31d40b19f026c2af1b433855c/ipc/ipc_channel_mojo.cc
,
Mar 10 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/cdcb02624c340873b4d7d6e2564666e60f3aa856 commit cdcb02624c340873b4d7d6e2564666e60f3aa856 Author: Wez <wez@chromium.org> Date: Sat Mar 10 00:49:30 2018 Add missing message-received expectation to IPCMojoBootstrapTest. Some of these tests expected a message from the child process, and would verify that it was valid/invalid as expected, but would still pass if no message were received at all. Bug: 816620 Change-Id: Ie8acf85086f4416c7023118673c40d00a2e190d7 Reviewed-on: https://chromium-review.googlesource.com/955554 Reviewed-by: Ken Rockot <rockot@chromium.org> Commit-Queue: Wez <wez@chromium.org> Cr-Commit-Position: refs/heads/master@{#542296} [modify] https://crrev.com/cdcb02624c340873b4d7d6e2564666e60f3aa856/ipc/ipc_mojo_bootstrap_unittest.cc
,
Mar 12 2018
SharedBufferTest.CreateAndPassReadOnlyBuffer failed in https://build.chromium.org/p/chromium.fyi/builders/Fuchsia%20%28dbg%29/builds/17032 with a Mojo precondition error from wait.
,
Mar 12 2018
The IPCMojoBootstrapTest.ReceiveEmptyMessage has been failing flakily on the Android KitKat Phone Tester (dbg): https://test-results.appspot.com/dashboards/flakiness_dashboard.html#testType=ipc_tests&tests=IPCMojoBootstrapTest.ReceiveEmptyMessage The failure is I 49.118s run_tests_on_device(06b9f8df00622ff4) [ RUN ] IPCMojoBootstrapTest.ReceiveEmptyMessage I 49.118s run_tests_on_device(06b9f8df00622ff4) ../../ipc/ipc_mojo_bootstrap_unittest.cc:71: Failure I 49.118s run_tests_on_device(06b9f8df00622ff4) Expected equality of these values: I 49.118s run_tests_on_device(06b9f8df00622ff4) expected_message I 49.119s run_tests_on_device(06b9f8df00622ff4) Which is: true I 49.119s run_tests_on_device(06b9f8df00622ff4) received_message_ I 49.119s run_tests_on_device(06b9f8df00622ff4) Which is: false I 49.119s run_tests_on_device(06b9f8df00622ff4) [ FAILED ] IPCMojoBootstrapTest.ReceiveEmptyMessage (79 ms) will that be fixed in this bug?
,
Mar 12 2018
Re #23: Yes, I believe so. I added that expectation in https://chromium-review.googlesource.com/955554 - I'll revert that, to re-land after the fix is in-place.
,
Mar 12 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/6d1fd84a5ea8e6f47a83cfd5cac5f39221836370 commit 6d1fd84a5ea8e6f47a83cfd5cac5f39221836370 Author: Wez <wez@chromium.org> Date: Mon Mar 12 17:29:56 2018 Revert "Add missing message-received expectation to IPCMojoBootstrapTest." This reverts commit cdcb02624c340873b4d7d6e2564666e60f3aa856. Reason for revert: We believe that this causes the tests to flake on various platforms, due to issue 816620 . We'll re-land it when the mojo::edk::Channel fix has landed. Original change's description: > Add missing message-received expectation to IPCMojoBootstrapTest. > > Some of these tests expected a message from the child process, and would > verify that it was valid/invalid as expected, but would still pass if > no message were received at all. > > Bug: 816620 > Change-Id: Ie8acf85086f4416c7023118673c40d00a2e190d7 > Reviewed-on: https://chromium-review.googlesource.com/955554 > Reviewed-by: Ken Rockot <rockot@chromium.org> > Commit-Queue: Wez <wez@chromium.org> > Cr-Commit-Position: refs/heads/master@{#542296} TBR=wez@chromium.org,rockot@chromium.org # Not skipping CQ checks because original CL landed > 1 day ago. Bug: 816620 Change-Id: I534fa23c12072cf08507ed4c8073e7505edaf262 Reviewed-on: https://chromium-review.googlesource.com/959241 Reviewed-by: Wez <wez@chromium.org> Commit-Queue: Wez <wez@chromium.org> Cr-Commit-Position: refs/heads/master@{#542526} [modify] https://crrev.com/6d1fd84a5ea8e6f47a83cfd5cac5f39221836370/ipc/ipc_mojo_bootstrap_unittest.cc
,
Mar 12 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/e960518251867845ed23bf2ab8f042a77afe1885 commit e960518251867845ed23bf2ab8f042a77afe1885 Author: Wez <wez@chromium.org> Date: Mon Mar 12 22:40:29 2018 Delay Channel::OnError() in case of kDisconnected during Write(). Write() operations to a Channel can fail due to the peer having closed it, while there are still messages waiting to be read from it. We must therefore defer notifying the caller of the Channel::Error until we observe end-of-stream via a readable notification, otherwise those messages may be dropped (depending on whether the posted OnError task is processed before or after a pending Channel-readable event). Bug: 816620 Change-Id: I75bd34a48edf4022809d27ce49f9cfba7a5d4daf Reviewed-on: https://chromium-review.googlesource.com/956932 Commit-Queue: Wez <wez@chromium.org> Reviewed-by: Ken Rockot <rockot@chromium.org> Cr-Commit-Position: refs/heads/master@{#542634} [modify] https://crrev.com/e960518251867845ed23bf2ab8f042a77afe1885/mojo/edk/system/channel_fuchsia.cc [modify] https://crrev.com/e960518251867845ed23bf2ab8f042a77afe1885/mojo/edk/system/channel_posix.cc [modify] https://crrev.com/e960518251867845ed23bf2ab8f042a77afe1885/mojo/edk/system/channel_unittest.cc [modify] https://crrev.com/e960518251867845ed23bf2ab8f042a77afe1885/mojo/edk/system/channel_win.cc
,
Mar 12 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/3fe9c08982644207abfeae5e02ffb90a74309d46 commit 3fe9c08982644207abfeae5e02ffb90a74309d46 Author: Wez <wez@chromium.org> Date: Mon Mar 12 23:19:23 2018 Reland "Add missing message-received expectation to IPCMojoBootstrapTest." This is a reland of cdcb02624c340873b4d7d6e2564666e60f3aa856, which was reverted pending a fix for an underlying Mojo message-loss issue on channel shutdown. Original change's description: > Add missing message-received expectation to IPCMojoBootstrapTest. > > Some of these tests expected a message from the child process, and would > verify that it was valid/invalid as expected, but would still pass if > no message were received at all. > > Bug: 816620 > Change-Id: Ie8acf85086f4416c7023118673c40d00a2e190d7 > Reviewed-on: https://chromium-review.googlesource.com/955554 > Reviewed-by: Ken Rockot <rockot@chromium.org> > Commit-Queue: Wez <wez@chromium.org> > Cr-Commit-Position: refs/heads/master@{#542296} TBR: rockot Bug: 816620 Change-Id: Ifdf10581091be31a6c9d1aa53542f2ed6c2ccbfa Reviewed-on: https://chromium-review.googlesource.com/958016 Reviewed-by: Wez <wez@chromium.org> Commit-Queue: Wez <wez@chromium.org> Cr-Commit-Position: refs/heads/master@{#542650} [modify] https://crrev.com/3fe9c08982644207abfeae5e02ffb90a74309d46/ipc/ipc_mojo_bootstrap_unittest.cc
,
Mar 12 2018
Remaining work is to un-filter tests under Fuchsia; and re-enable Mojo/IPC tests that were previously disabled as flaky due to this failure mode.
,
Mar 13 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/6a84b3ed1c97d1fad1ebf401dc937e806cc895b7 commit 6a84b3ed1c97d1fad1ebf401dc937e806cc895b7 Author: Wez <wez@chromium.org> Date: Tue Mar 13 00:16:23 2018 Un-filter some Mojo and IPC tests under Fuchsia. These tests were flakey due to messages being dropped by the receiver occasionally when the underlying plaform-specific Channel was closed. Bug: 764015 , 764018 , 756112 , 810448 , 815314 , 816620 Change-Id: I2b6cfb105660a570c09b29555da7811541d49ec9 Reviewed-on: https://chromium-review.googlesource.com/959674 Commit-Queue: Wez <wez@chromium.org> Commit-Queue: John Budorick <jbudorick@chromium.org> Reviewed-by: John Budorick <jbudorick@chromium.org> Cr-Commit-Position: refs/heads/master@{#542661} [modify] https://crrev.com/6a84b3ed1c97d1fad1ebf401dc937e806cc895b7/testing/buildbot/filters/fuchsia.ipc_tests.filter [modify] https://crrev.com/6a84b3ed1c97d1fad1ebf401dc937e806cc895b7/testing/buildbot/filters/fuchsia.mojo_unittests.filter
,
Mar 13 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/5d5fef1259d675889ed0767ed1002f4695520870 commit 5d5fef1259d675889ed0767ed1002f4695520870 Author: Findit <findit-for-me@appspot.gserviceaccount.com> Date: Tue Mar 13 02:29:08 2018 Revert "Delay Channel::OnError() in case of kDisconnected during Write()." This reverts commit e960518251867845ed23bf2ab8f042a77afe1885. Reason for revert: Findit (https://goo.gl/kROfz5) identified CL at revision 542634 as the culprit for failures in the build cycles as shown on: https://findit-for-me.appspot.com/waterfall/culprit?key=ag9zfmZpbmRpdC1mb3ItbWVyRAsSDVdmU3VzcGVjdGVkQ0wiMWNocm9taXVtL2U5NjA1MTgyNTE4Njc4NDVlZDIzYmYyYWI4ZjA0MmE3N2FmZTE4ODUM Sample Failed Build: https://ci.chromium.org/buildbot/chromium.memory/Linux%20MSan%20Tests/8479 Sample Failed Step: mojo_unittests Original change's description: > Delay Channel::OnError() in case of kDisconnected during Write(). > > Write() operations to a Channel can fail due to the peer having closed > it, while there are still messages waiting to be read from it. We must > therefore defer notifying the caller of the Channel::Error until we > observe end-of-stream via a readable notification, otherwise those > messages may be dropped (depending on whether the posted OnError task > is processed before or after a pending Channel-readable event). > > Bug: 816620 > Change-Id: I75bd34a48edf4022809d27ce49f9cfba7a5d4daf > Reviewed-on: https://chromium-review.googlesource.com/956932 > Commit-Queue: Wez <wez@chromium.org> > Reviewed-by: Ken Rockot <rockot@chromium.org> > Cr-Commit-Position: refs/heads/master@{#542634} Change-Id: I9d212ecb8dde1e90422b60e950547e823357f0d4 No-Presubmit: true No-Tree-Checks: true No-Try: true Bug: 816620 Reviewed-on: https://chromium-review.googlesource.com/959761 Cr-Commit-Position: refs/heads/master@{#542698} [modify] https://crrev.com/5d5fef1259d675889ed0767ed1002f4695520870/mojo/edk/system/channel_fuchsia.cc [modify] https://crrev.com/5d5fef1259d675889ed0767ed1002f4695520870/mojo/edk/system/channel_posix.cc [modify] https://crrev.com/5d5fef1259d675889ed0767ed1002f4695520870/mojo/edk/system/channel_unittest.cc [modify] https://crrev.com/5d5fef1259d675889ed0767ed1002f4695520870/mojo/edk/system/channel_win.cc
,
Mar 13 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/3e64a8a06daa346ccdd75dac3befaa16cee24def commit 3e64a8a06daa346ccdd75dac3befaa16cee24def Author: Wez <wez@chromium.org> Date: Tue Mar 13 05:36:06 2018 Reland "Delay Channel::OnError() in case of kDisconnected during Write()." This is a reland of e960518251867845ed23bf2ab8f042a77afe1885 The original CL added a unit-test which created a single-byte Channel::Message, without actually initializing that single-byte. This caused the MSAN bots to (correctly) spot that uninitialized data was being read during serialization. Original change's description: > Delay Channel::OnError() in case of kDisconnected during Write(). > > Write() operations to a Channel can fail due to the peer having closed > it, while there are still messages waiting to be read from it. We must > therefore defer notifying the caller of the Channel::Error until we > observe end-of-stream via a readable notification, otherwise those > messages may be dropped (depending on whether the posted OnError task > is processed before or after a pending Channel-readable event). > > Bug: 816620 > Change-Id: I75bd34a48edf4022809d27ce49f9cfba7a5d4daf > Reviewed-on: https://chromium-review.googlesource.com/956932 > Commit-Queue: Wez <wez@chromium.org> > Reviewed-by: Ken Rockot <rockot@chromium.org> > Cr-Commit-Position: refs/heads/master@{#542634} TBR: rockot Bug: 816620 Change-Id: I1a1d6eb7fa712e50b3d9c86591878900f0aeb388 Reviewed-on: https://chromium-review.googlesource.com/959762 Reviewed-by: Wez <wez@chromium.org> Commit-Queue: Wez <wez@chromium.org> Cr-Commit-Position: refs/heads/master@{#542739} [modify] https://crrev.com/3e64a8a06daa346ccdd75dac3befaa16cee24def/mojo/edk/system/channel_fuchsia.cc [modify] https://crrev.com/3e64a8a06daa346ccdd75dac3befaa16cee24def/mojo/edk/system/channel_posix.cc [modify] https://crrev.com/3e64a8a06daa346ccdd75dac3befaa16cee24def/mojo/edk/system/channel_unittest.cc [modify] https://crrev.com/3e64a8a06daa346ccdd75dac3befaa16cee24def/mojo/edk/system/channel_win.cc
,
Mar 14 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/0021badf45558cc13b25dfb9da7b3d671d464c5f commit 0021badf45558cc13b25dfb9da7b3d671d464c5f Author: Wez <wez@chromium.org> Date: Wed Mar 14 19:45:42 2018 Revert to manual ref-counting in mojo::edk::ChannelWin. Changes to manual ref-counting in a recent change to the timing of write-error notifications in ChannelWin missed that multiple write operations may be in-progress at the same time, and so introduced issues with the lifetime of the ChannelWin object if ShutDown() while write operations were in-progress. We also add some DCHECKs to sanity-check the pending-IO flags. Bug: 821658, 816620 Change-Id: I69cea0a1ed080ca2920738f5212a25d5f5e63c42 Reviewed-on: https://chromium-review.googlesource.com/961941 Commit-Queue: Wez <wez@chromium.org> Reviewed-by: Ken Rockot <rockot@chromium.org> Cr-Commit-Position: refs/heads/master@{#543162} [modify] https://crrev.com/0021badf45558cc13b25dfb9da7b3d671d464c5f/mojo/edk/system/channel_win.cc
,
Mar 15 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/cd568190cfdef377b434f31c56a7ac102e66d0cc commit cd568190cfdef377b434f31c56a7ac102e66d0cc Author: Wez <wez@chromium.org> Date: Thu Mar 15 16:12:32 2018 Re-enable a test that we suspect flaked due to message-loss. Bug: 816620 Change-Id: I25d8b9d627578946a6c4adb2c220c9401f21c59b Reviewed-on: https://chromium-review.googlesource.com/959677 Reviewed-by: Ken Rockot <rockot@chromium.org> Commit-Queue: Wez <wez@chromium.org> Cr-Commit-Position: refs/heads/master@{#543386} [modify] https://crrev.com/cd568190cfdef377b434f31c56a7ac102e66d0cc/ipc/ipc_channel_proxy_unittest.cc
,
Mar 16 2018
Not aware of any further formerly-flaky tests that are candidates for re-enabling following this fix, so closing out.
,
May 30 2018
Issue 819358 has been merged into this issue. |
|||||||
►
Sign in to add a comment |
|||||||
Comment 1 by w...@chromium.org
, Feb 26 2018