New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 818593 link

Starred by 7 users

Issue metadata

Status: Fixed
Owner:
Closed: Nov 27
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 2
Type: Bug

Blocked on:
issue 840127



Sign in to add a comment

DCHECK hit in ProcessMap::OnServicePIDReceived

Project Member Reported by grt@chromium.org, Mar 5 2018

Issue description

launching my local build (at r540792 with r540266 reverted) crashes on this DCHECK:

void ProcessMap::OnServicePIDReceived(const service_manager::Identity& identity,
                                      uint32_t pid) {
  if (pid == base::kNullProcessId)
    return;
  auto it_and_inserted = instances_.emplace(identity, pid);
  DCHECK(it_and_inserted.second);
}


0:003:x86> k
 # ChildEBP RetAddr  
00 0764e4b8 6f3bb0a9 base!base::debug::BreakDebugger+0x9 [C:\src\chromium\src\base\debug\debugger_win.cc @ 21]
01 0764e9d8 68870c0a base!logging::LogMessage::~LogMessage+0x389 [C:\src\chromium\src\base\logging.cc @ 842]
02 0764eaa8 6667f772 content!memory_instrumentation::ProcessMap::OnServicePIDReceived+0x68 [C:\src\chromium\src\services\resource_coordinator\memory_instrumentation\process_map.cc @ 61]
03 0764eba4 6887122e service_manager_mojom!service_manager::mojom::ServiceManagerListenerStubDispatch::Accept+0x550 [C:\src\chromium\src\out\rel\gen\services\service_manager\public\mojom\service_manager.mojom.cc @ 420]
04 0764ebb4 6f31d634 content!service_manager::mojom::ServiceManagerListenerStub<mojo::RawPtrImplRefTraits<service_manager::mojom::ServiceManagerListener> >::Accept+0x14 [C:\src\chromium\src\out\rel\gen\services\service_manager\public\mojom\service_manager.mojom.h @ 241]
05 0764ec94 6f31d09b bindings!mojo::InterfaceEndpointClient::HandleValidatedMessage+0x22a [C:\src\chromium\src\mojo\public\cpp\bindings\lib\interface_endpoint_client.cc @ 419]
06 0764ed6c 6f31e8b3 bindings!mojo::FilterChain::Accept+0x85 [C:\src\chromium\src\mojo\public\cpp\bindings\lib\filter_chain.cc @ 40]
07 0764ee3c 6f32439d bindings!mojo::InterfaceEndpointClient::HandleIncomingMessage+0x6d [C:\src\chromium\src\mojo\public\cpp\bindings\lib\interface_endpoint_client.cc @ 306]
08 0764ef1c 6f323ddb bindings!mojo::internal::MultiplexRouter::ProcessIncomingMessage+0x353 [C:\src\chromium\src\mojo\public\cpp\bindings\lib\multiplex_router.cc @ 881]
09 0764f014 6f31d09b bindings!mojo::internal::MultiplexRouter::Accept+0x131 [C:\src\chromium\src\mojo\public\cpp\bindings\lib\multiplex_router.cc @ 608]
0a 0764f0ec 6f3178fc bindings!mojo::FilterChain::Accept+0x85 [C:\src\chromium\src\mojo\public\cpp\bindings\lib\filter_chain.cc @ 40]
0b 0764f234 6f3182e2 bindings!mojo::Connector::ReadSingleMessage+0x178 [C:\src\chromium\src\mojo\public\cpp\bindings\lib\connector.cc @ 444]
0c 0764f320 6f318153 bindings!mojo::Connector::ReadAllAvailableMessages+0x58 [C:\src\chromium\src\mojo\public\cpp\bindings\lib\connector.cc @ 474]
0d 0764f3ec 6f318e85 bindings!mojo::Connector::OnHandleReadyInternal+0x81 [C:\src\chromium\src\mojo\public\cpp\bindings\lib\connector.cc @ 377]
0e 0764f3f8 6f672310 bindings!base::internal::Invoker<base::internal::BindState<void (mojo::Connector::*)(unsigned int) __attribute__((thiscall)),base::internal::UnretainedWrapper<mojo::Connector> >,void (unsigned int)>::Run+0xf [C:\src\chromium\src\base\bind_internal.h @ 589]
0f 0764f410 6f674ff3 mojo_public_system_cpp!mojo::SimpleWatcher::DiscardReadyState+0x18 [C:\src\chromium\src\mojo\public\cpp\system\simple_watcher.h @ 194]
10 0764f424 6f6776c6 mojo_public_system_cpp!base::internal::Invoker<base::internal::BindState<void (*)(const base::RepeatingCallback<void (unsigned int)> &, unsigned int, const mojo::HandleSignalsState &),base::RepeatingCallback<void (unsigned int)> >,void (unsigned int, const mojo::HandleSignalsState &)>::Run+0x13 [C:\src\chromium\src\base\bind_internal.h @ 586]
11 0764f508 6f677ac9 mojo_public_system_cpp!mojo::SimpleWatcher::OnHandleReady+0xe4 [C:\src\chromium\src\mojo\public\cpp\system\simple_watcher.cc @ 276]
12 0764f52c 6f38c4ed mojo_public_system_cpp!base::internal::Invoker<base::internal::BindState<void (mojo::SimpleWatcher::*)(int, unsigned int, const mojo::HandleSignalsState &) __attribute__((thiscall)),base::WeakPtr<mojo::SimpleWatcher>,int,unsigned int,mojo::HandleSignalsState>,void ()>::Run+0x3b [C:\src\chromium\src\base\bind_internal.h @ 589]
13 0764f640 6f3d355f base!base::debug::TaskAnnotator::RunTask+0xed [C:\src\chromium\src\base\debug\task_annotator.cc @ 61]
14 0764f784 6f3d3c3d base!base::MessageLoop::RunTask+0x27f [C:\src\chromium\src\base\message_loop\message_loop.cc @ 396]
15 0764f860 6f3d3e8a base!base::MessageLoop::DeferOrRunPendingTask+0x9d [C:\src\chromium\src\base\message_loop\message_loop.cc @ 407]
16 0764f9bc 6f3d5d05 base!base::MessageLoop::DoWork+0x1fa [C:\src\chromium\src\base\message_loop\message_loop.cc @ 451]
17 0764f9d0 6f3d4e6e base!base::MessagePumpForIO::DoRunLoop+0x85 [C:\src\chromium\src\base\message_loop\message_pump_win.cc @ 476]
18 0764fa04 6f3d2d69 base!base::MessagePumpWin::Run+0x6e [C:\src\chromium\src\base\message_loop\message_pump_win.cc @ 58]
19 0764fadc 6f41f29f base!base::MessageLoop::Run+0xa9 [C:\src\chromium\src\base\message_loop\message_loop.cc @ 348]
1a 0764fbb0 6f4766c4 base!base::RunLoop::Run+0xcf [C:\src\chromium\src\base\run_loop.cc @ 136]
1b 0764fc7c 67f80179 base!base::Thread::Run+0xa4 [C:\src\chromium\src\base\threading\thread.cc @ 255]
1c 0764fd4c 67f802c2 content!content::BrowserThreadImpl::IOThreadRun+0x2d [C:\src\chromium\src\content\browser\browser_thread_impl.cc @ 234]
1d 0764fe24 6f476ca7 content!content::BrowserThreadImpl::Run+0xc4 [C:\src\chromium\src\content\browser\browser_thread_impl.cc @ 260]
1e 0764ff08 6f471e85 base!base::Thread::ThreadMain+0x277 [C:\src\chromium\src\base\threading\thread.cc @ 341]
1f 0764ff2c 76a68654 base!base::`anonymous namespace'::ThreadFunc+0xa5 [C:\src\chromium\src\base\threading\platform_thread_win.cc @ 94]
20 0764ff40 77b64a47 KERNEL32!BaseThreadInitThunk+0x24
21 0764ff88 77b64a17 ntdll_77b00000!__RtlUserThreadStart+0x2f
22 0764ff98 00000000 ntdll_77b00000!_RtlUserThreadStart+0x1b
 

Comment 1 by hjd@chromium.org, Mar 5 2018

Cc: primiano@chromium.org lalitm@chromium.org
+cc: primiano & lalitm

Comment 2 by sorin@chromium.org, Mar 6 2018

it repros locally for me too at r540933.

Comment 3 by sorin@chromium.org, Mar 7 2018

It does not repro anymore for me at r541226.
Was it crashing at startup? I'm trying to revert the CL mentioned in c0 (in https://chromium-review.googlesource.com/c/chromium/src/+/955923) and I'm not hitting this DCHECK.
Just saw this crash on a debug build on r541186.
It was not on startup - it happened while I left the browser open during lunch.
Blocking: 840127
I am seeing this happening repeatedly. I'm running a Chromium build, I don't think I'm doing anything particularly interesting other than leaving it running for several minutes.

Here is most of the Autos window from Visual Studio - I guess the patch_service was already inserted and we tried to insert it again? I'm going to comment out the DCHECK for now.

+		identity	{name_="patch_service" user_id_="505C0EE9-3013-43C0-82B0-A84F50CF8D84" instance_="" }	const service_manager::Identity &
-		instances_	{ size=15 }	std::map<service_manager::Identity,unsigned long,std::less<service_manager::Identity>,std::allocator<std::pair<const service_manager::Identity,unsigned long> > >
+		[comparator]	less	std::_Compressed_pair<std::less<service_manager::Identity>,std::_Compressed_pair<std::allocator<std::_Tree_node<std::pair<const service_manager::Identity,unsigned long>,void *> >,std::_Tree_val<std::_Tree_simple_types<std::pair<const service_manager::Identity,unsigned long> > >,1>,1>
+		[allocator]	allocator	std::_Compressed_pair<std::allocator<std::_Tree_node<std::pair<const service_manager::Identity,unsigned long>,void *> >,std::_Tree_val<std::_Tree_simple_types<std::pair<const service_manager::Identity,unsigned long> > >,1>
+		[{name_="audio" user_id_="505C0EE9-3013-43C0-82B0-A84F50CF8D84" instance_="" }]	52376	std::pair<const service_manager::Identity,unsigned long>
+		[{name_="catalog" user_id_="505C0EE9-3013-43C0-82B0-A84F50CF8D84" instance_="" }]	34772	std::pair<const service_manager::Identity,unsigned long>
+		[{name_="content_browser" user_id_="505C0EE9-3013-43C0-82B0-A84F50CF8D84" instance_="" }]	34772	std::pair<const service_manager::Identity,unsigned long>
+		[{name_="content_browser" user_id_="e19ff48e-cfcc-4b4f-9589-73e7942a0bf2" instance_="" }]	34772	std::pair<const service_manager::Identity,unsigned long>
+		[{name_="content_gpu" user_id_="505C0EE9-3013-43C0-82B0-A84F50CF8D84" instance_="2" }]	27704	std::pair<const service_manager::Identity,unsigned long>
+		[{name_="content_packaged_services" user_id_="205fc0d0-60d7-4772-a57d-3228a98adc2c" instance_="" }]	34772	std::pair<const service_manager::Identity,unsigned long>
+		[{name_="content_renderer" user_id_="e19ff48e-cfcc-4b4f-9589-73e7942a0bf2" instance_="11_1" }]	43564	std::pair<const service_manager::Identity,unsigned long>
+		[{name_="content_renderer" user_id_="e19ff48e-cfcc-4b4f-9589-73e7942a0bf2" instance_="12_1" }]	19892	std::pair<const service_manager::Identity,unsigned long>
+		[{name_="content_renderer" user_id_="e19ff48e-cfcc-4b4f-9589-73e7942a0bf2" instance_="13_1" }]	52808	std::pair<const service_manager::Identity,unsigned long>
+		[{name_="content_utility" user_id_="505C0EE9-3013-43C0-82B0-A84F50CF8D84" instance_="22" }]	19572	std::pair<const service_manager::Identity,unsigned long>
+		[{name_="content_utility" user_id_="505C0EE9-3013-43C0-82B0-A84F50CF8D84" instance_="5" }]	9088	std::pair<const service_manager::Identity,unsigned long>
+		[{name_="content_utility" user_id_="505C0EE9-3013-43C0-82B0-A84F50CF8D84" instance_="9" }]	52376	std::pair<const service_manager::Identity,unsigned long>
+		[{name_="patch_service" user_id_="505C0EE9-3013-43C0-82B0-A84F50CF8D84" instance_="" }]	19572	std::pair<const service_manager::Identity,unsigned long>
+		[{name_="proxy_resolver" user_id_="505C0EE9-3013-43C0-82B0-A84F50CF8D84" instance_="" }]	9088	std::pair<const service_manager::Identity,unsigned long>
+		[{name_="service_manager" user_id_="505C0EE9-3013-43C0-82B0-A84F50CF8D84" instance_="" }]	34772	std::pair<const service_manager::Identity,unsigned long>
+		[Raw View]	{...}	std::map<service_manager::Identity,unsigned long,std::less<service_manager::Identity>,std::allocator<std::pair<const service_manager::Identity,unsigned long> > >
+		it_and_inserted	(({name_="patch_service" user_id_="505C0EE9-3013-43C0-82B0-A84F50CF8D84" instance_="" }, 19572), false)	std::pair<std::_Tree_iterator<std::_Tree_val<std::_Tree_simple_types<std::pair<const service_manager::Identity,unsigned long> > > >,bool>
		it_and_inserted.second	false	bool
		pid	9784	unsigned int

Labels: Target-71
More specifically, the failure happened because an item with:

identity: {name_="patch_service" user_id_="505C0EE9-3013-43C0-82B0-A84F50CF8D84" instance_="" }

was inserted when it already existed in the map (duplicate names are fine, duplicate identities are not).

Is this useful information? Is there additional data that I should try to grab?

As I said, I wasn't doing anything particularly unusual, just running a debug build and letting it sit for a few minutes. It would be great to get this fixed, both to fix the presumed correctness problems and to avoid having this DCHECK fire while investigating other bugs.

Cc: -lalitm@chromium.org roc...@chromium.org ssid@chromium.org
Owner: lalitm@chromium.org
+rockot@, is it expected that we get to see two services with the same identity? That seems quite unexpected to me. Is this a bug of the Service Manager callbacks?

Bruce, thanks for filing this and dumping the identity strings, should definitely help.

Moving over to Lalit who is more familiar with that code.
Yeah unfortunately I think the API is racy. If an instance $id dies or becomes unreachable by the service manager, and a new connection to the same $id is requested immediately after that, a new instance with $id will be created. It's then possible for a ServiceMangaerListener to receive the new instance's OnServicePIDReceived before it receives the old instance's OnServiceStopped. This same essential problem is covered by  bug 840127 .
This hit again, somebody pinged me because tryjobs were failing on [1].
The fact that the ServiceMangaerListener API is racy is really unfortunate.
For the moment I'm going to force-replace entries and remove that dcheck, but I guess at some point we'll pay this and some metrics will be skewed if something else will go wrong and undetected.


[1] https://logs.chromium.org/logs/chromium/buildbucket/cr-buildbucket.appspot.com/8932942268393419584/+/steps/browser_tests__with_patch_/0/stdout
Blockedon: 840127
Blocking: -840127
Project Member

Comment 13 by bugdroid1@chromium.org, Oct 11

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/5203a4a475d0eb4c10ddacd9728ca6803c07487f

commit 5203a4a475d0eb4c10ddacd9728ca6803c07487f
Author: Kyle Horimoto <khorimoto@google.com>
Date: Thu Oct 11 22:06:43 2018

ProcessMap: Work around racy service listener.

Previously, we would DCHECK() that the ProcessMap did not contain an
identity when initializing it. However, this led to some false-positives
which resulted in the map containing spurious entries.

This CL updates ProcessMap::OnInit() to work around this issue by
removing any existing entries for a given identity before trying to add
new ones.

Bug:  818593 ,  884066 
Change-Id: Ic69af1f6b622dc73dde8fb09ad659b887857e384
Reviewed-on: https://chromium-review.googlesource.com/c/1277808
Commit-Queue: Kyle Horimoto <khorimoto@chromium.org>
Reviewed-by: Primiano Tucci <primiano@chromium.org>
Cr-Commit-Position: refs/heads/master@{#598972}
[modify] https://crrev.com/5203a4a475d0eb4c10ddacd9728ca6803c07487f/services/resource_coordinator/memory_instrumentation/process_map.cc

Cc: lalitm@chromium.org hjd@chromium.org wfh@chromium.org erikc...@chromium.org
 Issue 893253  has been merged into this issue.
Cc: -roc...@chromium.org rockot@google.com
This is still happening. It just happened on a repo that is up-to-date to November 9th, a month after the presumed fix was landed. I think we should remove the DCHECK until we figure out why it is being flaky (and reinstate it as a CHECK, or not at all).

crrev.com/c/1334004 is the CL I just created.

Coincidentally, I landed https://crrev.com/606960 on Nov 9. This should actually fix the problem, because it ensures that no exactly identical value of Identity is ever reused.
Owner: rockot@google.com
I'll take this, but I suspect we can close it. Let's see what happens once the CHECK lands.
Oh, wait, I assumed your CL was changing the DCHECK to a CHECK. :)

Maybe we don't need to land it now? Can you try syncing?
Since https://crrev.com/606960 should actually fix it I will abandon my change and resurrect if if I see the DCHECK again. Thanks for the fix.

It is still mysterious that I was able to hit the DCHECK given the workaround landed with crrev.com/c/1277808. I can't explain that (possibly user error?) but all we can do is watch for future issues - or perhaps change the DCHECK to a CHECK.

Project Member

Comment 21 by bugdroid1@chromium.org, Nov 27

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/3bdf26ad8f19ee9b629e09eb04ecc95c870fe772

commit 3bdf26ad8f19ee9b629e09eb04ecc95c870fe772
Author: Ken Rockot <rockot@google.com>
Date: Tue Nov 27 18:10:27 2018

Fix raciness in AudioServiceListener

This fixes potential race conditions in AudioServiceListener resulting
from the fact that a service instance can become unreachable (and may
therefore be replaced with a new one) before it is officially designated
as "stopped" by the Service Manager, even in the case of singleton
services.

The ServiceManagerListener API is augmented to convey a service
instance's current running state via RunningServiceInfo. This is used
by AudioServiceListener to more accurately track the running audio
service instance.

The listener now tracks the Identity of the most recently created
audio service instance and responds to events accordingly.

This also updates the logic and comments in resource_coordinator's
ProcessMap to accurately reflect current guarantees of the
ServiceManagerListener API, since these have changed slightly.

NOTRY=true

Bug:  818593 , 907898 ,908319
Change-Id: If3e8a35aa90db72a34c8f5f2d458f0f24c3447eb
Reviewed-on: https://chromium-review.googlesource.com/c/1350649
Commit-Queue: Ken Rockot <rockot@google.com>
Reviewed-by: Olga Sharonova <olka@chromium.org>
Reviewed-by: Tom Sepez <tsepez@chromium.org>
Reviewed-by: oysteine <oysteine@chromium.org>
Cr-Commit-Position: refs/heads/master@{#611193}
[modify] https://crrev.com/3bdf26ad8f19ee9b629e09eb04ecc95c870fe772/content/browser/renderer_host/media/audio_service_listener.cc
[modify] https://crrev.com/3bdf26ad8f19ee9b629e09eb04ecc95c870fe772/content/browser/renderer_host/media/audio_service_listener.h
[modify] https://crrev.com/3bdf26ad8f19ee9b629e09eb04ecc95c870fe772/content/browser/renderer_host/media/audio_service_listener_unittest.cc
[modify] https://crrev.com/3bdf26ad8f19ee9b629e09eb04ecc95c870fe772/services/resource_coordinator/memory_instrumentation/process_map.cc
[modify] https://crrev.com/3bdf26ad8f19ee9b629e09eb04ecc95c870fe772/services/service_manager/public/mojom/service_manager.mojom
[modify] https://crrev.com/3bdf26ad8f19ee9b629e09eb04ecc95c870fe772/services/service_manager/service_manager.cc
[modify] https://crrev.com/3bdf26ad8f19ee9b629e09eb04ecc95c870fe772/services/service_manager/service_manager.h

Status: Fixed (was: Assigned)

Sign in to add a comment