DCHECK hit in ProcessMap::OnServicePIDReceived |
||||||||
Issue descriptionlaunching my local build (at r540792 with r540266 reverted) crashes on this DCHECK: void ProcessMap::OnServicePIDReceived(const service_manager::Identity& identity, uint32_t pid) { if (pid == base::kNullProcessId) return; auto it_and_inserted = instances_.emplace(identity, pid); DCHECK(it_and_inserted.second); } 0:003:x86> k # ChildEBP RetAddr 00 0764e4b8 6f3bb0a9 base!base::debug::BreakDebugger+0x9 [C:\src\chromium\src\base\debug\debugger_win.cc @ 21] 01 0764e9d8 68870c0a base!logging::LogMessage::~LogMessage+0x389 [C:\src\chromium\src\base\logging.cc @ 842] 02 0764eaa8 6667f772 content!memory_instrumentation::ProcessMap::OnServicePIDReceived+0x68 [C:\src\chromium\src\services\resource_coordinator\memory_instrumentation\process_map.cc @ 61] 03 0764eba4 6887122e service_manager_mojom!service_manager::mojom::ServiceManagerListenerStubDispatch::Accept+0x550 [C:\src\chromium\src\out\rel\gen\services\service_manager\public\mojom\service_manager.mojom.cc @ 420] 04 0764ebb4 6f31d634 content!service_manager::mojom::ServiceManagerListenerStub<mojo::RawPtrImplRefTraits<service_manager::mojom::ServiceManagerListener> >::Accept+0x14 [C:\src\chromium\src\out\rel\gen\services\service_manager\public\mojom\service_manager.mojom.h @ 241] 05 0764ec94 6f31d09b bindings!mojo::InterfaceEndpointClient::HandleValidatedMessage+0x22a [C:\src\chromium\src\mojo\public\cpp\bindings\lib\interface_endpoint_client.cc @ 419] 06 0764ed6c 6f31e8b3 bindings!mojo::FilterChain::Accept+0x85 [C:\src\chromium\src\mojo\public\cpp\bindings\lib\filter_chain.cc @ 40] 07 0764ee3c 6f32439d bindings!mojo::InterfaceEndpointClient::HandleIncomingMessage+0x6d [C:\src\chromium\src\mojo\public\cpp\bindings\lib\interface_endpoint_client.cc @ 306] 08 0764ef1c 6f323ddb bindings!mojo::internal::MultiplexRouter::ProcessIncomingMessage+0x353 [C:\src\chromium\src\mojo\public\cpp\bindings\lib\multiplex_router.cc @ 881] 09 0764f014 6f31d09b bindings!mojo::internal::MultiplexRouter::Accept+0x131 [C:\src\chromium\src\mojo\public\cpp\bindings\lib\multiplex_router.cc @ 608] 0a 0764f0ec 6f3178fc bindings!mojo::FilterChain::Accept+0x85 [C:\src\chromium\src\mojo\public\cpp\bindings\lib\filter_chain.cc @ 40] 0b 0764f234 6f3182e2 bindings!mojo::Connector::ReadSingleMessage+0x178 [C:\src\chromium\src\mojo\public\cpp\bindings\lib\connector.cc @ 444] 0c 0764f320 6f318153 bindings!mojo::Connector::ReadAllAvailableMessages+0x58 [C:\src\chromium\src\mojo\public\cpp\bindings\lib\connector.cc @ 474] 0d 0764f3ec 6f318e85 bindings!mojo::Connector::OnHandleReadyInternal+0x81 [C:\src\chromium\src\mojo\public\cpp\bindings\lib\connector.cc @ 377] 0e 0764f3f8 6f672310 bindings!base::internal::Invoker<base::internal::BindState<void (mojo::Connector::*)(unsigned int) __attribute__((thiscall)),base::internal::UnretainedWrapper<mojo::Connector> >,void (unsigned int)>::Run+0xf [C:\src\chromium\src\base\bind_internal.h @ 589] 0f 0764f410 6f674ff3 mojo_public_system_cpp!mojo::SimpleWatcher::DiscardReadyState+0x18 [C:\src\chromium\src\mojo\public\cpp\system\simple_watcher.h @ 194] 10 0764f424 6f6776c6 mojo_public_system_cpp!base::internal::Invoker<base::internal::BindState<void (*)(const base::RepeatingCallback<void (unsigned int)> &, unsigned int, const mojo::HandleSignalsState &),base::RepeatingCallback<void (unsigned int)> >,void (unsigned int, const mojo::HandleSignalsState &)>::Run+0x13 [C:\src\chromium\src\base\bind_internal.h @ 586] 11 0764f508 6f677ac9 mojo_public_system_cpp!mojo::SimpleWatcher::OnHandleReady+0xe4 [C:\src\chromium\src\mojo\public\cpp\system\simple_watcher.cc @ 276] 12 0764f52c 6f38c4ed mojo_public_system_cpp!base::internal::Invoker<base::internal::BindState<void (mojo::SimpleWatcher::*)(int, unsigned int, const mojo::HandleSignalsState &) __attribute__((thiscall)),base::WeakPtr<mojo::SimpleWatcher>,int,unsigned int,mojo::HandleSignalsState>,void ()>::Run+0x3b [C:\src\chromium\src\base\bind_internal.h @ 589] 13 0764f640 6f3d355f base!base::debug::TaskAnnotator::RunTask+0xed [C:\src\chromium\src\base\debug\task_annotator.cc @ 61] 14 0764f784 6f3d3c3d base!base::MessageLoop::RunTask+0x27f [C:\src\chromium\src\base\message_loop\message_loop.cc @ 396] 15 0764f860 6f3d3e8a base!base::MessageLoop::DeferOrRunPendingTask+0x9d [C:\src\chromium\src\base\message_loop\message_loop.cc @ 407] 16 0764f9bc 6f3d5d05 base!base::MessageLoop::DoWork+0x1fa [C:\src\chromium\src\base\message_loop\message_loop.cc @ 451] 17 0764f9d0 6f3d4e6e base!base::MessagePumpForIO::DoRunLoop+0x85 [C:\src\chromium\src\base\message_loop\message_pump_win.cc @ 476] 18 0764fa04 6f3d2d69 base!base::MessagePumpWin::Run+0x6e [C:\src\chromium\src\base\message_loop\message_pump_win.cc @ 58] 19 0764fadc 6f41f29f base!base::MessageLoop::Run+0xa9 [C:\src\chromium\src\base\message_loop\message_loop.cc @ 348] 1a 0764fbb0 6f4766c4 base!base::RunLoop::Run+0xcf [C:\src\chromium\src\base\run_loop.cc @ 136] 1b 0764fc7c 67f80179 base!base::Thread::Run+0xa4 [C:\src\chromium\src\base\threading\thread.cc @ 255] 1c 0764fd4c 67f802c2 content!content::BrowserThreadImpl::IOThreadRun+0x2d [C:\src\chromium\src\content\browser\browser_thread_impl.cc @ 234] 1d 0764fe24 6f476ca7 content!content::BrowserThreadImpl::Run+0xc4 [C:\src\chromium\src\content\browser\browser_thread_impl.cc @ 260] 1e 0764ff08 6f471e85 base!base::Thread::ThreadMain+0x277 [C:\src\chromium\src\base\threading\thread.cc @ 341] 1f 0764ff2c 76a68654 base!base::`anonymous namespace'::ThreadFunc+0xa5 [C:\src\chromium\src\base\threading\platform_thread_win.cc @ 94] 20 0764ff40 77b64a47 KERNEL32!BaseThreadInitThunk+0x24 21 0764ff88 77b64a17 ntdll_77b00000!__RtlUserThreadStart+0x2f 22 0764ff98 00000000 ntdll_77b00000!_RtlUserThreadStart+0x1b
,
Mar 6 2018
it repros locally for me too at r540933.
,
Mar 7 2018
It does not repro anymore for me at r541226.
,
Mar 8 2018
Was it crashing at startup? I'm trying to revert the CL mentioned in c0 (in https://chromium-review.googlesource.com/c/chromium/src/+/955923) and I'm not hitting this DCHECK.
,
Mar 9 2018
Just saw this crash on a debug build on r541186. It was not on startup - it happened while I left the browser open during lunch.
,
May 7 2018
,
Sep 22
I am seeing this happening repeatedly. I'm running a Chromium build, I don't think I'm doing anything particularly interesting other than leaving it running for several minutes.
Here is most of the Autos window from Visual Studio - I guess the patch_service was already inserted and we tried to insert it again? I'm going to comment out the DCHECK for now.
+ identity {name_="patch_service" user_id_="505C0EE9-3013-43C0-82B0-A84F50CF8D84" instance_="" } const service_manager::Identity &
- instances_ { size=15 } std::map<service_manager::Identity,unsigned long,std::less<service_manager::Identity>,std::allocator<std::pair<const service_manager::Identity,unsigned long> > >
+ [comparator] less std::_Compressed_pair<std::less<service_manager::Identity>,std::_Compressed_pair<std::allocator<std::_Tree_node<std::pair<const service_manager::Identity,unsigned long>,void *> >,std::_Tree_val<std::_Tree_simple_types<std::pair<const service_manager::Identity,unsigned long> > >,1>,1>
+ [allocator] allocator std::_Compressed_pair<std::allocator<std::_Tree_node<std::pair<const service_manager::Identity,unsigned long>,void *> >,std::_Tree_val<std::_Tree_simple_types<std::pair<const service_manager::Identity,unsigned long> > >,1>
+ [{name_="audio" user_id_="505C0EE9-3013-43C0-82B0-A84F50CF8D84" instance_="" }] 52376 std::pair<const service_manager::Identity,unsigned long>
+ [{name_="catalog" user_id_="505C0EE9-3013-43C0-82B0-A84F50CF8D84" instance_="" }] 34772 std::pair<const service_manager::Identity,unsigned long>
+ [{name_="content_browser" user_id_="505C0EE9-3013-43C0-82B0-A84F50CF8D84" instance_="" }] 34772 std::pair<const service_manager::Identity,unsigned long>
+ [{name_="content_browser" user_id_="e19ff48e-cfcc-4b4f-9589-73e7942a0bf2" instance_="" }] 34772 std::pair<const service_manager::Identity,unsigned long>
+ [{name_="content_gpu" user_id_="505C0EE9-3013-43C0-82B0-A84F50CF8D84" instance_="2" }] 27704 std::pair<const service_manager::Identity,unsigned long>
+ [{name_="content_packaged_services" user_id_="205fc0d0-60d7-4772-a57d-3228a98adc2c" instance_="" }] 34772 std::pair<const service_manager::Identity,unsigned long>
+ [{name_="content_renderer" user_id_="e19ff48e-cfcc-4b4f-9589-73e7942a0bf2" instance_="11_1" }] 43564 std::pair<const service_manager::Identity,unsigned long>
+ [{name_="content_renderer" user_id_="e19ff48e-cfcc-4b4f-9589-73e7942a0bf2" instance_="12_1" }] 19892 std::pair<const service_manager::Identity,unsigned long>
+ [{name_="content_renderer" user_id_="e19ff48e-cfcc-4b4f-9589-73e7942a0bf2" instance_="13_1" }] 52808 std::pair<const service_manager::Identity,unsigned long>
+ [{name_="content_utility" user_id_="505C0EE9-3013-43C0-82B0-A84F50CF8D84" instance_="22" }] 19572 std::pair<const service_manager::Identity,unsigned long>
+ [{name_="content_utility" user_id_="505C0EE9-3013-43C0-82B0-A84F50CF8D84" instance_="5" }] 9088 std::pair<const service_manager::Identity,unsigned long>
+ [{name_="content_utility" user_id_="505C0EE9-3013-43C0-82B0-A84F50CF8D84" instance_="9" }] 52376 std::pair<const service_manager::Identity,unsigned long>
+ [{name_="patch_service" user_id_="505C0EE9-3013-43C0-82B0-A84F50CF8D84" instance_="" }] 19572 std::pair<const service_manager::Identity,unsigned long>
+ [{name_="proxy_resolver" user_id_="505C0EE9-3013-43C0-82B0-A84F50CF8D84" instance_="" }] 9088 std::pair<const service_manager::Identity,unsigned long>
+ [{name_="service_manager" user_id_="505C0EE9-3013-43C0-82B0-A84F50CF8D84" instance_="" }] 34772 std::pair<const service_manager::Identity,unsigned long>
+ [Raw View] {...} std::map<service_manager::Identity,unsigned long,std::less<service_manager::Identity>,std::allocator<std::pair<const service_manager::Identity,unsigned long> > >
+ it_and_inserted (({name_="patch_service" user_id_="505C0EE9-3013-43C0-82B0-A84F50CF8D84" instance_="" }, 19572), false) std::pair<std::_Tree_iterator<std::_Tree_val<std::_Tree_simple_types<std::pair<const service_manager::Identity,unsigned long> > > >,bool>
it_and_inserted.second false bool
pid 9784 unsigned int
,
Sep 24
More specifically, the failure happened because an item with:
identity: {name_="patch_service" user_id_="505C0EE9-3013-43C0-82B0-A84F50CF8D84" instance_="" }
was inserted when it already existed in the map (duplicate names are fine, duplicate identities are not).
Is this useful information? Is there additional data that I should try to grab?
As I said, I wasn't doing anything particularly unusual, just running a debug build and letting it sit for a few minutes. It would be great to get this fixed, both to fix the presumed correctness problems and to avoid having this DCHECK fire while investigating other bugs.
,
Sep 26
+rockot@, is it expected that we get to see two services with the same identity? That seems quite unexpected to me. Is this a bug of the Service Manager callbacks? Bruce, thanks for filing this and dumping the identity strings, should definitely help. Moving over to Lalit who is more familiar with that code.
,
Sep 26
Yeah unfortunately I think the API is racy. If an instance $id dies or becomes unreachable by the service manager, and a new connection to the same $id is requested immediately after that, a new instance with $id will be created. It's then possible for a ServiceMangaerListener to receive the new instance's OnServicePIDReceived before it receives the old instance's OnServiceStopped. This same essential problem is covered by bug 840127 .
,
Oct 11
This hit again, somebody pinged me because tryjobs were failing on [1]. The fact that the ServiceMangaerListener API is racy is really unfortunate. For the moment I'm going to force-replace entries and remove that dcheck, but I guess at some point we'll pay this and some metrics will be skewed if something else will go wrong and undetected. [1] https://logs.chromium.org/logs/chromium/buildbucket/cr-buildbucket.appspot.com/8932942268393419584/+/steps/browser_tests__with_patch_/0/stdout
,
Oct 11
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/5203a4a475d0eb4c10ddacd9728ca6803c07487f commit 5203a4a475d0eb4c10ddacd9728ca6803c07487f Author: Kyle Horimoto <khorimoto@google.com> Date: Thu Oct 11 22:06:43 2018 ProcessMap: Work around racy service listener. Previously, we would DCHECK() that the ProcessMap did not contain an identity when initializing it. However, this led to some false-positives which resulted in the map containing spurious entries. This CL updates ProcessMap::OnInit() to work around this issue by removing any existing entries for a given identity before trying to add new ones. Bug: 818593 , 884066 Change-Id: Ic69af1f6b622dc73dde8fb09ad659b887857e384 Reviewed-on: https://chromium-review.googlesource.com/c/1277808 Commit-Queue: Kyle Horimoto <khorimoto@chromium.org> Reviewed-by: Primiano Tucci <primiano@chromium.org> Cr-Commit-Position: refs/heads/master@{#598972} [modify] https://crrev.com/5203a4a475d0eb4c10ddacd9728ca6803c07487f/services/resource_coordinator/memory_instrumentation/process_map.cc
,
Oct 12
Issue 893253 has been merged into this issue.
,
Oct 17
,
Nov 13
This is still happening. It just happened on a repo that is up-to-date to November 9th, a month after the presumed fix was landed. I think we should remove the DCHECK until we figure out why it is being flaky (and reinstate it as a CHECK, or not at all). crrev.com/c/1334004 is the CL I just created.
,
Nov 13
Coincidentally, I landed https://crrev.com/606960 on Nov 9. This should actually fix the problem, because it ensures that no exactly identical value of Identity is ever reused.
,
Nov 13
I'll take this, but I suspect we can close it. Let's see what happens once the CHECK lands.
,
Nov 13
Oh, wait, I assumed your CL was changing the DCHECK to a CHECK. :) Maybe we don't need to land it now? Can you try syncing?
,
Nov 15
Since https://crrev.com/606960 should actually fix it I will abandon my change and resurrect if if I see the DCHECK again. Thanks for the fix. It is still mysterious that I was able to hit the DCHECK given the workaround landed with crrev.com/c/1277808. I can't explain that (possibly user error?) but all we can do is watch for future issues - or perhaps change the DCHECK to a CHECK.
,
Nov 27
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/3bdf26ad8f19ee9b629e09eb04ecc95c870fe772 commit 3bdf26ad8f19ee9b629e09eb04ecc95c870fe772 Author: Ken Rockot <rockot@google.com> Date: Tue Nov 27 18:10:27 2018 Fix raciness in AudioServiceListener This fixes potential race conditions in AudioServiceListener resulting from the fact that a service instance can become unreachable (and may therefore be replaced with a new one) before it is officially designated as "stopped" by the Service Manager, even in the case of singleton services. The ServiceManagerListener API is augmented to convey a service instance's current running state via RunningServiceInfo. This is used by AudioServiceListener to more accurately track the running audio service instance. The listener now tracks the Identity of the most recently created audio service instance and responds to events accordingly. This also updates the logic and comments in resource_coordinator's ProcessMap to accurately reflect current guarantees of the ServiceManagerListener API, since these have changed slightly. NOTRY=true Bug: 818593 , 907898 ,908319 Change-Id: If3e8a35aa90db72a34c8f5f2d458f0f24c3447eb Reviewed-on: https://chromium-review.googlesource.com/c/1350649 Commit-Queue: Ken Rockot <rockot@google.com> Reviewed-by: Olga Sharonova <olka@chromium.org> Reviewed-by: Tom Sepez <tsepez@chromium.org> Reviewed-by: oysteine <oysteine@chromium.org> Cr-Commit-Position: refs/heads/master@{#611193} [modify] https://crrev.com/3bdf26ad8f19ee9b629e09eb04ecc95c870fe772/content/browser/renderer_host/media/audio_service_listener.cc [modify] https://crrev.com/3bdf26ad8f19ee9b629e09eb04ecc95c870fe772/content/browser/renderer_host/media/audio_service_listener.h [modify] https://crrev.com/3bdf26ad8f19ee9b629e09eb04ecc95c870fe772/content/browser/renderer_host/media/audio_service_listener_unittest.cc [modify] https://crrev.com/3bdf26ad8f19ee9b629e09eb04ecc95c870fe772/services/resource_coordinator/memory_instrumentation/process_map.cc [modify] https://crrev.com/3bdf26ad8f19ee9b629e09eb04ecc95c870fe772/services/service_manager/public/mojom/service_manager.mojom [modify] https://crrev.com/3bdf26ad8f19ee9b629e09eb04ecc95c870fe772/services/service_manager/service_manager.cc [modify] https://crrev.com/3bdf26ad8f19ee9b629e09eb04ecc95c870fe772/services/service_manager/service_manager.h
,
Nov 27
|
||||||||
►
Sign in to add a comment |
||||||||
Comment 1 by hjd@chromium.org
, Mar 5 2018