
Issue 757946

Starred by 5 users

Issue metadata

Status: Fixed
Owner:
Closed: Dec 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 1
Type: Bug-Regression

Blocking:
issue 596231




NOTREACHED in GlobalActivityTracker::RecordProcessLaunch

Project Member Reported by siggi@chromium.org, Aug 22 2017

Issue description

This is hitting in my local dcheck_always_on build:

void GlobalActivityTracker::RecordProcessLaunch(
    ProcessId process_id,
    const FilePath::StringType& cmd) {
  const int64_t pid = process_id;
  DCHECK_NE(GetProcessId(), pid);
  DCHECK_NE(0, pid);

  base::AutoLock lock(global_tracker_lock_);
  if (base::ContainsKey(known_processes_, pid)) {
    // TODO(bcwhite): Measure this in UMA.
    NOTREACHED() << "Process #" << process_id  // <<< HERE
                 << " was previously recorded as \"launched\""
                 << " with no corresponding exit.";
    known_processes_.erase(pid);
  }

#if defined(OS_WIN)
  known_processes_.insert(std::make_pair(pid, UTF16ToUTF8(cmd)));
#else
  known_processes_.insert(std::make_pair(pid, cmd));
#endif
}

 

Comment 1 by siggi@chromium.org, Aug 22 2017

Blocking: 596231
What type of process was being launched that caused this CHECK?
Cc: siggi@chromium.org etienneb@chromium.org erikc...@chromium.org
 Issue 756149  has been merged into this issue.

Comment 4 by siggi@chromium.org, Aug 23 2017

On deliberation, what might be happening here is that either events are being lost, or not synchronized. PIDs in Windows are aggressively reused, so it's generally unwise to key on PID alone. {PID|CreationTime} does however uniquely identify a process instance.
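
To make the {PID|CreationTime} idea concrete, here is a minimal sketch of a tracker map keyed on both values. It is not the Chromium implementation: `ProcessKey` and `KnownProcesses` are hypothetical names, and the creation time is an opaque 64-bit value supplied by the caller (on Windows it could plausibly come from `GetProcessTimes()`).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical key: a PID paired with the creation time of that process
// instance. A reused PID gets a different creation time, so the keys differ.
using ProcessKey = std::pair<int64_t, int64_t>;  // {pid, creation_time}

// Hypothetical container, standing in for a map like known_processes_.
class KnownProcesses {
 public:
  // Returns false if this exact process instance was already recorded.
  bool RecordLaunch(int64_t pid, int64_t creation_time, std::string cmd) {
    assert(pid != 0);  // Mirrors the DCHECK_NE(0, pid) in the report.
    return map_.emplace(ProcessKey(pid, creation_time), std::move(cmd)).second;
  }

  // Returns true if a matching instance was found and removed.
  bool RecordExit(int64_t pid, int64_t creation_time) {
    return map_.erase(ProcessKey(pid, creation_time)) == 1;
  }

  std::size_t size() const { return map_.size(); }

 private:
  std::map<ProcessKey, std::string> map_;
};
```

With this keying, a second launch that reuses PID 2556 but has a later creation time is stored as a separate entry instead of colliding with the stale one. It does not remove the need to record exits; stale entries would still accumulate.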

Comment 5 by siggi@chromium.org, Aug 23 2017

This hasn't reproduced in my DCHECK build in a couple of hours of usage, so I don't think it needs to block shipping DCHECK.

Comment 6 by siggi@chromium.org, Aug 24 2017

Labels: Hotlist-dcheck

Comment 7 by siggi@chromium.org, Oct 17 2017

This hits me every couple of days or so. See e.g. crash/e7582a26b9c65088.
Do you have a way to reproduce it at will?

Comment 9 by siggi@chromium.org, Oct 17 2017

Nope, the check just hits every other day or so. You can repro locally with go/syzyasan-optin, then set chrome://flags/#dcheck-is-fatal to "Enabled" and run under "windbg -g -G -o".
That'll drop you into a debugger when the DCHECK hits.
Owner: bcwh...@chromium.org
Status: Started (was: Untriaged)
Cc: khushals...@chromium.org
 Issue 779212  has been merged into this issue.

Comment 12 by bugdroid1@chromium.org, Oct 30 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/e7f9a9155ea03d25da4e5269c9532fce9d5f77ac

commit e7f9a9155ea03d25da4e5269c9532fce9d5f77ac
Author: Brian White <bcwhite@chromium.org>
Date: Mon Oct 30 14:53:47 2017

Include command line in error message.

In order to find the process that isn't being cleaned up, it's
necessary to know what the process is.

Bug:  757946 
Change-Id: I4c7d8cf139451fdfc194d6aa02d28ad8799dddc8
Reviewed-on: https://chromium-review.googlesource.com/726599
Reviewed-by: Sigurður Ásgeirsson <siggi@chromium.org>
Commit-Queue: Brian White <bcwhite@chromium.org>
Cr-Commit-Position: refs/heads/master@{#512493}
[modify] https://crrev.com/e7f9a9155ea03d25da4e5269c9532fce9d5f77ac/base/debug/activity_tracker.cc

Comment 13 by kbr@chromium.org, Oct 30 2017

Another failure:
https://ci.chromium.org/buildbot/chromium.gpu.fyi/Win7%20Experimental%20Release%20%28NVIDIA%29/260

The test:
WebglConformance_conformance2_textures_video_tex_3d_rgb565_rgb_unsigned_short_5_6_5

from the webgl2_conformance_tests suite failed. Stack trace:

  ********************************************************************************
  	Last event: 5c4.5a4: Break instruction exception - code 80000003 (first/second chance not available)
  	  debugger time: Mon Oct 30 08:15:17.764 2017 (UTC - 7:00)
  	ChildEBP RetAddr  Args to Child              
  	06cde6ac 6b50d217 6d2da4fd 00000587 0c261e94 chrome!base::debug::BreakDebugger+0xc
  	06cde6d4 6b43ce9e 0294fc88 06cde720 06cde724 chrome!?Run@?$Invoker@U?$BindState@P6AXPBDHV?$BasicStringPiece@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@base@@1@Z$$V@internal@base@@$$A6AXPBDHV?$BasicStringPiece@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@3@1@Z@internal@base@@SAXPAVBindStateBase@23@$$QAPBD$$QAH$$QAV?$BasicStringPiece@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@3@3@Z+0x25
  	06cdeb68 6b4cdbe4 02610358 0297ffc8 000003b0 chrome!logging::LogMessage::~LogMessage+0x41e
  	06cdec70 6b49cbbc 00000978 06cdee04 084db101 chrome!base::debug::GlobalActivityTracker::RecordProcessLaunch+0x2f4
  	06cdedf0 6b49c57d 06cdeed8 06cdee04 06cdeef0 chrome!base::LaunchProcess+0x5ec
  	06cdee2c 6c06ba71 06cdeed8 08687820 06cdeef0 chrome!base::LaunchProcess+0x2d
  	06cdf10c 6aa9d5bd 08687820 06cdf180 06cdf1d4 chrome!service_manager::SandboxWin::StartSandboxedProcess+0x10b
  	06cdf1b0 6ad977bf 086d7750 08687820 06cdf1d4 chrome!content::StartSandboxedProcess+0x14c
  	06cdf2bc 6ad96ada 06cdf2f0 06cdf308 00000000 chrome!content::internal::ChildProcessLauncherHelper::LaunchProcessOnLauncherThread+0x19b
  	06cdf3cc 6b4fdf95 07114f68 00310fb3 6b44080d chrome!content::internal::ChildProcessLauncherHelper::LaunchOnLauncherThread+0x102
  	06cdf4f0 6b4f8e19 6d2e0b07 0c21bc00 01010170 chrome!base::debug::TaskAnnotator::RunTask+0xe5
  	06cdf62c 6b4f844b 0c21bc00 02666550 00000001 chrome!base::internal::TaskTracker::RunOrSkipTask+0x279
  	06cdf72c 6b507a93 06cdf758 02666550 02666500 chrome!base::internal::TaskTracker::RunNextTask+0x12b
  	06cdf820 6b458cd3 02692c60 0000042c 0000042c chrome!base::internal::SchedulerWorker::Thread::ThreadMain+0x263
  	*** WARNING: Unable to verify checksum for kernel32.dll
  	*** ERROR: Symbol file could not be found.  Defaulted to export symbols for kernel32.dll - 
  	06cdf844 771b338a 026933c8 06cdf890 77849902 chrome!base::PlatformThread::GetCurrentThreadPriority+0x1d3
  	WARNING: Stack unwind information not available. Following frames may be wrong.
  	06cdf850 77849902 026933c8 7008d84c 00000000 kernel32!BaseThreadInitThunk+0x12
  	06cdf890 778498d5 6b458c30 026933c8 ffffffff ntdll!RtlInitializeExceptionChain+0x63
  	06cdf8a8 00000000 6b458c30 026933c8 00000000 ntdll!RtlInitializeExceptionChain+0x36
  	
It's only an intermittent failure but is affecting random tests. To run these, build the telemetry_gpu_integration_test target (which has Chrome as a dependency) and run:

./content/test/gpu/run_gpu_integration_test.py webgl_conformance --browser=release --webgl-conformance-version=2.0.1 --test-filter=conformance2_textures_video_tex_3d_rgb565_rgb_unsigned_short_5_6_5

Alternatively, use --browser=debug if you have a debug build.

It's unlikely to fail with just that one test. Perhaps use conformance2_textures_video as the filter, or conformance2_textures.

Comment 14 by kbr@chromium.org, Nov 1 2017

Cc: jmad...@chromium.org zmo@chromium.org ynovikov@chromium.org cwallez@chromium.org kbr@chromium.org
 Issue 780465  has been merged into this issue.

Comment 15 by kbr@chromium.org, Nov 1 2017

Labels: -Type-Bug -Pri-2 Pri-1 Type-Bug-Regression
This is now affecting the ANGLE project's commit queue, causing random test flakiness and preventing CLs from landing. See  Issue 780465 . Recent failures:

https://build.chromium.org/p/tryserver.chromium.angle/builders/win_angle_rel_ng/builds/7541

https://chromium-swarm.appspot.com/task?id=39908a6ee4dd5810&refresh=10&show_raw=1

https://build.chromium.org/p/tryserver.chromium.angle/builders/win_angle_rel_ng/builds/7538

https://chromium-swarm.appspot.com/task?id=399063b9cf126110&refresh=10&show_raw=1

https://build.chromium.org/p/tryserver.chromium.angle/builders/win_angle_rel_ng/builds/7536

https://chromium-swarm.appspot.com/task?id=39904a4522527810&refresh=10&show_raw=1

There is no way we can work around this. Upgrading this bug to P1. bcwhite@, please do something to make this stop crashing, whether it is downgrading the NOTREACHED to something else, improving the tracking, etc.
Full error info:

[1156:908:1101/081036.329:FATAL:activity_tracker.cc(1415)] Check failed: false. Process #2556 was previously recorded as "launched" with no corresponding exit.
"c:\b\swarm_slave\w\ir\out\Release\chrome.exe" --type=gpu-process --field-trial-handle=1300,18379742025283163801,14681353244698629401,131072 --disable-gpu-sandbox --enable-logging=stderr --use-cmd-decoder=validating --noerrdialogs --user-data-dir="c:\b\swarm_slave\w\itsx5riw\tmpr9wmxh" --gpu-preferences=GAAAAAAAAAAQBwAAAQAAAAAAAAAAAGAA --gpu-vendor-id=0x1002 --gpu-device-id=0x6613 --gpu-driver-vendor="Advanced Micro Devices, Inc." --gpu-driver-version=21.19.137.1 --gpu-driver-date=9-16-2016 --gpu-secondary-vendor-ids=0x102b --gpu-secondary-device-ids=0x0534 --noerrdialogs --user-data-dir="c:\b\swarm_slave\w\itsx5riw\tmpr9wmxh" --enable-logging=stderr --service-request-channel-token=F8113125735ACD9F00D8540775107014 --mojo-platform-channel-handle=2472 /prefetch:2
Backtrace:
	base::debug::StackTrace::StackTrace [0x66B93350+32]
	base::debug::StackTrace::StackTrace [0x66B6DD0D+13]
	logging::LogMessage::~LogMessage [0x66B05BBE+78]
	base::debug::GlobalActivityTracker::RecordProcessLaunch [0x66B97274+884]
	base::LaunchProcess [0x66B6581C+1516]
	base::LaunchProcess [0x66B651DD+45]
	service_manager::SandboxWin::StartSandboxedProcess [0x677465E9+267]
	content::StartSandboxedProcess [0x6615D939+332]
	content::internal::ChildProcessLauncherHelper::LaunchProcessOnLauncherThread [0x66459B7F+411]
	content::internal::ChildProcessLauncherHelper::LaunchOnLauncherThread [0x66458E36+258]
	base::debug::TaskAnnotator::RunTask [0x66BC7745+229]
	base::internal::TaskTracker::RunOrSkipTask [0x66BC2758+664]
	base::internal::TaskTracker::RunNextTask [0x66BC1D6B+299]
	base::internal::SchedulerWorker::Thread::ThreadMain [0x66BD1353+611]
	base::PlatformThread::GetCurrentThreadPriority [0x66B21F83+467]
	BaseThreadInitThunk [0x75FA337A+18]
	RtlInitializeExceptionChain [0x775892B2+99]
	RtlInitializeExceptionChain [0x77589285+54]


Comment 17 by kbr@chromium.org, Nov 2 2017

The flakiness induced by this assertion failure is causing CQ jobs to be retried multiple times, causing excessive load on the tryservers. This was part of the reason for a machine outage today. See Issue 780662 and Issue 780665.

I can't stress enough how important a fix for this is. Please acknowledge that this is being worked on.

Pretty sure I've got it nailed down:
https://chromium-review.googlesource.com/c/chromium/src/+/750281

Comment 19 by bugdroid1@chromium.org, Nov 2 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/ae2a8b9a9455eede005c731e053e22ab98a0f6d7

commit ae2a8b9a9455eede005c731e053e22ab98a0f6d7
Author: Brian White <bcwhite@chromium.org>
Date: Thu Nov 02 19:10:36 2017

Record browser process exits in tracker.

The tracker needs to be notified when a process exits but the Process
object doesn't do that (and can't do that) on its own so it needs to
be called where the termination status is fetched.

It still makes sense to pipe this through the Process object just to
keep things in one place and make it easier to be called from other
places where necessary.

Bug:  757946 
Change-Id: I9e039bcd143277461e6b8323f1756e03a46f005b
Reviewed-on: https://chromium-review.googlesource.com/750281
Reviewed-by: Mark Mentovai <mark@chromium.org>
Reviewed-by: Antoine Labour <piman@chromium.org>
Commit-Queue: Jamie Madill <jmadill@chromium.org>
Cr-Commit-Position: refs/heads/master@{#513577}
[modify] https://crrev.com/ae2a8b9a9455eede005c731e053e22ab98a0f6d7/base/process/process.h
[modify] https://crrev.com/ae2a8b9a9455eede005c731e053e22ab98a0f6d7/base/process/process_fuchsia.cc
[modify] https://crrev.com/ae2a8b9a9455eede005c731e053e22ab98a0f6d7/base/process/process_posix.cc
[modify] https://crrev.com/ae2a8b9a9455eede005c731e053e22ab98a0f6d7/base/process/process_win.cc
[modify] https://crrev.com/ae2a8b9a9455eede005c731e053e22ab98a0f6d7/content/browser/child_process_launcher.cc

Status: Fixed (was: Started)
Big thanks Brian for fixing this!

Comment 22 by w...@chromium.org, Nov 8 2017

Status: Assigned (was: Fixed)
FYI, just observed this in Albatross build 64.0.3261.1, crash id f740e36fb21b4cfb - I think the CL from comment #19 should be present already in this build?

Might this be related to the issue of PID = 0u processes sometimes cropping up in Task Manager?

Comment 23 by w...@chromium.org, Nov 8 2017

And another instance (crash id abfd38180cd8ae9a) - this one was launching a NativeMessaging child process, FWIW.
Both of these are the gnubbyagent process not being cleaned up.  I'll take a look.

Comment 25 by bugdroid1@chromium.org, Nov 9 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/d8cc0afb39ed0c1bf7e701d800c11d7a5d29d820

commit d8cc0afb39ed0c1bf7e701d800c11d7a5d29d820
Author: Brian White <bcwhite@chromium.org>
Date: Thu Nov 09 12:01:45 2017

Record process exit from EnsureProcessTerminated.

When a process is terminated, the tracker has to be notified so that
it can be cleaned up.

Bug:  757946 
Change-Id: I1b894893b3c15228923f88573aa19d46faaa33c0
Reviewed-on: https://chromium-review.googlesource.com/759088
Reviewed-by: Mark Mentovai <mark@chromium.org>
Commit-Queue: Brian White <bcwhite@chromium.org>
Cr-Commit-Position: refs/heads/master@{#515144}
[modify] https://crrev.com/d8cc0afb39ed0c1bf7e701d800c11d7a5d29d820/base/process/kill_posix.cc
[modify] https://crrev.com/d8cc0afb39ed0c1bf7e701d800c11d7a5d29d820/base/process/kill_win.cc

Status: Fixed (was: Assigned)
In case the cause of these is not clear...

Chrome has common code for handling the launch of new processes but has no common code for reaping them.  Thus, tracking code has to be added to the many places where this is done and it's easy to miss them.

Please reopen if another case arises.
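
For anyone following along, the failure mode described above can be reproduced with a toy tracker keyed on PID alone. This is only an illustration under assumed names (`PidOnlyTracker` is hypothetical, not the real `GlobalActivityTracker`): if any reaping path forgets to call `RecordExit`, the next launch that reuses the PID finds a stale entry.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical, simplified tracker keyed on PID only.
class PidOnlyTracker {
 public:
  // Returns true if a stale entry for |pid| was found, i.e. the condition
  // under which the NOTREACHED in the report fires.
  bool RecordLaunch(int64_t pid, const std::string& cmd) {
    assert(pid != 0);  // Mirrors the DCHECK_NE(0, pid) in the report.
    const bool stale = known_.erase(pid) > 0;
    known_.emplace(pid, cmd);
    return stale;
  }

  // Must be called from every place that reaps a process.
  void RecordExit(int64_t pid) { known_.erase(pid); }

 private:
  std::map<int64_t, std::string> known_;
};
```

A launch/exit/launch sequence on the same PID reports no stale entry, while launch/launch with the exit missed in between does.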

Comment 27 by siggi@chromium.org, Nov 14 2017

Status: Assigned (was: Fixed)
crash/e1b6250cc508b5ec from 64.0.3267.1.
This one is a Renderer.  Renderer processes ARE reaped, though.  Hmmm...

It seems that a Renderer is created during startup, perhaps for the NTP, that dies without being reported.  Other renderers are seen to exit, though.  I wonder what's special about this first one.

Now if VC2017 would just stop crashing all the time, perhaps I could figure this out...
There's a race condition in Process::Terminate where if the process happens to exit before it can be terminated, the termination isn't recognized and the process is still believed (by that code) to be running and so never records the exit.
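
A rough sketch of that race, using a fake process object rather than the real `base::Process` (all names below are hypothetical, and this is not the actual patch): when the kill request fails because the process has already exited, the racy path returns early and the exit is never recorded.

```cpp
#include <cassert>

// Hypothetical stand-in for a child process; not a Chromium type.
struct FakeProcess {
  bool exited = false;
  bool exit_recorded = false;

  // Mimics a kill request that fails once the process has already exited.
  bool Kill() {
    if (exited) return false;
    exited = true;
    return true;
  }

  void RecordExit() {
    assert(!exit_recorded);  // An exit should be reported exactly once.
    exit_recorded = true;
  }
};

// Racy version: a failed kill is read as "still running", so the exit is
// never recorded if the process died on its own first.
bool TerminateRacy(FakeProcess& p) {
  if (!p.Kill()) return false;
  p.RecordExit();
  return true;
}

// Sketch of a fix: when the kill fails, check whether the process is
// already gone and record the exit in that case too.
bool TerminateChecked(FakeProcess& p) {
  if (!p.Kill() && !p.exited) return false;
  p.RecordExit();
  return true;
}
```

For a process whose `exited` flag is already set, `TerminateRacy` leaves `exit_recorded` false, whereas `TerminateChecked` records the exit.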

Comment 30 by bugdroid1@chromium.org, Nov 14 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/ccb2c6e22528fb2926067b46c62eb2ee7e9af86e

commit ccb2c6e22528fb2926067b46c62eb2ee7e9af86e
Author: Brian White <bcwhite@chromium.org>
Date: Tue Nov 14 19:09:44 2017

Fix race condition in Process::Terminate.

Bug:  757946 
Change-Id: Ia7e92e2a44f9f17f5bfae10a43ea05ccde74c9eb
Reviewed-on: https://chromium-review.googlesource.com/768968
Commit-Queue: Brian White <bcwhite@chromium.org>
Reviewed-by: Mark Mentovai <mark@chromium.org>
Cr-Commit-Position: refs/heads/master@{#516374}
[modify] https://crrev.com/ccb2c6e22528fb2926067b46c62eb2ee7e9af86e/base/process/process_win.cc

Status: Fixed (was: Assigned)
Another one down.
Re-open if another one pops up.

Comment 32 by w...@chromium.org, Nov 16 2017

Status: Assigned (was: Fixed)
Report Id 37f0d220efded90f appears to be a renderer Pid being re-used; looks like that build 64.0.3269.1 should include the patch from #30.

Comment 33 by w...@chromium.org, Nov 16 2017

I also see report id 8c1a097ac8f909b6 in my chrome://crashes, for this signature. 

Comment 34 by w...@chromium.org, Nov 22 2017

Labels: M-64
More crash Ids: 0400d024e8cb955b and 39bf42c855e593c1, in Chrome 64.0.3274.1.

Comment 35 by w...@chromium.org, Nov 22 2017

Labels: -Hotlist-dcheck Hotlist-Albatross-Dcheck

Comment 36 by w...@chromium.org, Nov 27 2017

And another (Id 45bbaf5f52eb3489) - this and the ones from #34 were all missed renderer teardowns.

Comment 37 by bugdroid1@chromium.org, Dec 1 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/6c8f3b9c3906cd4d9b96d264277d52b2c47e7564

commit 6c8f3b9c3906cd4d9b96d264277d52b2c47e7564
Author: Brian White <bcwhite@chromium.org>
Date: Fri Dec 01 18:55:31 2017

Add timeout waiting for exit of process.

Turns out that there are cases where a process has started exiting (and
thus can't be terminated) but still isn't ready to be reaped.  Thus, a
wait timeout is required in this case as well.

Bug:  757946 
Change-Id: Iaf79092a991200c14d321963d9a20bfc191867bc
Reviewed-on: https://chromium-review.googlesource.com/803837
Reviewed-by: Mark Mentovai <mark@chromium.org>
Commit-Queue: Brian White <bcwhite@chromium.org>
Cr-Commit-Position: refs/heads/master@{#521001}
[modify] https://crrev.com/6c8f3b9c3906cd4d9b96d264277d52b2c47e7564/base/process/process_win.cc

Another one hit me: crash/b1f9ab898ef97a0f. This is SyzyASAN version 64.0.3282.1; unfortunately the SyzyASAN build is broken ATM.
Only M65 is invulnerable.
(It is "inconceivable" that any more crashes will occur with M65 or above. :-)

Labels: -Hotlist-Albatross-Dcheck Hotlist-Dcheck-Albatross

Comment 42 by bugdroid1@chromium.org, Dec 14 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/c3bff5937cbf29ecc6ea51f189446bce4e2cff9d

commit c3bff5937cbf29ecc6ea51f189446bce4e2cff9d
Author: Brian White <bcwhite@chromium.org>
Date: Thu Dec 14 17:57:52 2017

Track process only if launch is successful.

If the process launch fails it ends up trying to track PID #0 which
causes a DCHECK.

Bug:  757946 
Change-Id: If4558b1707aad237273127365c5b5be64d32f327
Reviewed-on: https://chromium-review.googlesource.com/826085
Reviewed-by: Will Harris <wfh@chromium.org>
Reviewed-by: Mark Mentovai <mark@chromium.org>
Commit-Queue: Will Harris <wfh@chromium.org>
Commit-Queue: Brian White <bcwhite@chromium.org>
Cr-Commit-Position: refs/heads/master@{#524111}
[modify] https://crrev.com/c3bff5937cbf29ecc6ea51f189446bce4e2cff9d/services/service_manager/sandbox/win/sandbox_win.cc
