
Issue 818388

Starred by 4 users

Issue metadata

Status: Fixed
Owner:
Closed: Mar 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 1
Type: Bug-Regression

Blocked on:
issue 808526

Blocking:
issue 818832
issue 818852




Win10 Debug (NVIDIA) can't start Chrome's GPU process (apparently)

Project Member Reported by kainino@chromium.org, Mar 3 2018

Issue description

On Win10 Debug (NVIDIA) - vm91-m1 - starting with this build:
https://ci.chromium.org/buildbot/chromium.gpu/Win10%20Debug%20(NVIDIA)/719
there are crashes in GPU process startup. It seems there's something wrong with the state of this bot.

Some fail with:

[6436:8188:0302/144606.167:FATAL:compositor_util.cc(59)] Check failed: manager->IsGpuFeatureInfoAvailable(). 

Others seem to say this instead:

[1552:1144:0302/144640.268:ERROR:gpu_process_host.cc(465)] !GpuDataManagerImpl::GpuAccessAllowed()
[1552:1144:0302/144640.462:ERROR:gpu_process_host.cc(465)] !GpuDataManagerImpl::GpuAccessAllowed()
[1552:1144:0302/144640.462:ERROR:browser_gpu_channel_host_factory.cc(119)] Failed to launch GPU process.

Perhaps the machine needs a reboot? Perhaps something bad has happened to the system or driver state.

+zmo because I think you added the IsGpuFeatureInfoAvailable DCHECK recently. Apparently it is not always true, in some weird case.
 
Labels: Sheriff-Chromium
vm91-m1 just launches the tests on chromium-swarm. The failing shards are on a variety of hosts, which suggests the hosts themselves are not the cause.

Comment 3 by zmo@chromium.org, Mar 5 2018

Let me build on my Windows bot and see if I can reproduce.

Comment 4 by zmo@chromium.org, Mar 5 2018

Cc: geoffl...@chromium.org jmad...@chromium.org iannucci@chromium.org
 Issue 818727  has been merged into this issue.

Comment 5 by kbr@chromium.org, Mar 5 2018

Summary: Win10 Debug (NVIDIA) can't start Chrome's GPU process (apparently) (was: vm91-m1 can't start Chrome's GPU process (apparently))
Peter's right; this VM only triggers jobs on Swarming. All of the Swarming bots that are running the tasks from Win10 Debug (NVIDIA) are failing to launch the GPU process.

Currently comparing this run:

https://ci.chromium.org/buildbot/chromium.gpu/Win10%20Debug%20%28NVIDIA%29/718
and this working shard:
https://chromium-swarm.appspot.com/task?id=3c0193806f7e2a10&refresh=10&show_raw=1

to this failing run:

https://ci.chromium.org/buildbot/chromium.gpu/Win10%20Debug%20%28NVIDIA%29/719
and this failing shard:
https://chromium-swarm.appspot.com/task?id=3c01b4c777793410&refresh=10&show_raw=1

Haven't yet found any suspicious blamelists in the jobs around this one.

Comment 6 by kbr@chromium.org, Mar 5 2018

Components: -Infra>Labs Internals>GPU>Testing
Labels: Hotlist-PixelWrangler

Comment 8 by kbr@chromium.org, Mar 5 2018

Note that the FYI version of this bot is similarly failing:
https://ci.chromium.org/buildbot/chromium.gpu.fyi/Win10%20FYI%20Debug%20%28NVIDIA%29/?limit=200

It may not mean anything, but the passing Swarming slave in #5 is an older r210 that also reports the onboard Matrox (which is disabled in the BIOS).

The failing slave is a newer r230 that has the onboard Matrox fully disabled by the BIOS (i.e. the OS does not see it at all).
Cc: kylec...@chromium.org

Comment 11 by zmo@chromium.org, Mar 5 2018

On my local machine,

FATAL:process_map.cc(60): Check failed: it_and_inserted.second

Not sure whether it's related to this bug or whether my local build is corrupted.

Comment 12 by kbr@chromium.org, Mar 5 2018

Labels: -Restrict-View-Google

Comment 13 by kbr@chromium.org, Mar 5 2018

Owner: zmo@chromium.org
Thanks Mo for doing the build – can you continue to dig to figure out what's going on?

Comment 14 by zmo@chromium.org, Mar 5 2018

To be specific, it's not just the GPU process; the renderer process also failed to launch.

I am manually bisecting right now.
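For reference, a manual bisect over a revision range amounts to a binary search for the first bad revision. The sketch below is only illustrative: `first_bad`, the revision numbers, and the `is_bad` predicate are hypothetical, and in practice `is_bad` would mean "build this revision and check whether the child processes launch".

```python
def first_bad(revisions, is_bad):
    """Return the first revision for which is_bad() is True.

    Assumes revisions are ordered oldest to newest, that the last one is
    bad, and that every revision after the first bad one is also bad.
    """
    lo, hi = 0, len(revisions) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid          # culprit is at mid or earlier
        else:
            lo = mid + 1      # culprit is strictly after mid
    return revisions[lo]


# Hypothetical range and culprit, purely for illustration.
revisions = list(range(100, 120))
culprit = 113
print(first_bad(revisions, lambda rev: rev >= culprit))
```

Each step halves the remaining range, so about log2(N) builds are needed for N candidate revisions.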

Comment 15 by kbr@chromium.org, Mar 5 2018

Blocking: 818832

Comment 16 by kbr@chromium.org, Mar 5 2018

Cc: reillyg@chromium.org wfh@chromium.org jam@chromium.org

Comment 17 by wfh@chromium.org, Mar 5 2018

Cc: robliao@chromium.org elawrence@chromium.org
A few people in the Windows chat are also working on this; it seems to be related to exported functions?

"2410:4690 @ 1057360531 - LdrpAllocateTls - INFO: TlsVector 1DD6C1F8 Index 14 : 520 bytes copied from 68266000 to 1E027248
9378:8f04 @ 1057360531 - LdrpNameToOrdinal - WARNING: Procedure "?GetSystemScaleFactor@ScreenWin@win@display@@SAMXZ" could not be located in DLL at base 0x4C940000.
2410:4690 @ 1057360531 - LdrpAllocateTls - INFO: TlsVector 1DD6C1F8 Index 15 : 2 bytes copied from 72572000 to 1DCD7FC0
9378:8f04 @ 1057360531 - LdrpReportError - ERROR: Locating export "?GetSystemScaleFactor@ScreenWin@win@display@@SAMXZ" for DLL "C:\src\c\src\out\default\ui_base.dll" failed with status: 0xc0000139."
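When a loader log like the one above gets long, the failed export lookups can be pulled out mechanically. This is a minimal sketch that assumes the `LdrpReportError` line format shown in the quoted log; the function name `find_missing_exports` is made up for illustration. Status 0xc0000139 is STATUS_ENTRYPOINT_NOT_FOUND.

```python
import re

# Matches loader log lines of the form quoted above:
#   ... LdrpReportError - ERROR: Locating export "NAME" for DLL "PATH"
#   failed with status: 0xc0000139.
EXPORT_ERROR = re.compile(
    r'LdrpReportError - ERROR: Locating export "(?P<export>[^"]+)" '
    r'for DLL "(?P<dll>[^"]+)" failed with status: (?P<status>0x[0-9a-fA-F]+)'
)


def find_missing_exports(log_text):
    """Return (export, dll, status) tuples for every failed export lookup."""
    return [
        (m.group("export"), m.group("dll"), int(m.group("status"), 16))
        for m in EXPORT_ERROR.finditer(log_text)
    ]


# Sample line copied from the log in this comment.
SAMPLE = r'''9378:8f04 @ 1057360531 - LdrpReportError - ERROR: Locating export "?GetSystemScaleFactor@ScreenWin@win@display@@SAMXZ" for DLL "C:\src\c\src\out\default\ui_base.dll" failed with status: 0xc0000139.'''

for export, dll, status in find_missing_exports(SAMPLE):
    print(f"{dll} could not resolve {export} (status {status:#010x})")
```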
Re #17: Using a non-component build resolved my problem.

Comment 19 by jam@chromium.org, Mar 5 2018

where is the chat happening, i.e. what's the room name?

I'm also finding that Win7 isn't starting; I'm getting an error about API-MS-WIN-POWER-BASE-L1-1-0.DLL
https://chromium.googlesource.com/chromium/src/+/dc478ef126031896134faa8fc988d5f6cc5d87b7 is believed to be the source of #17, which may or may not be the source of the original bug.

Comment 22 by wfh@chromium.org, Mar 5 2018

if #20 is correct, then https://chromium-review.googlesource.com/653581 is the only CL that vaguely touches the area in #17

Comment 23 by zmo@chromium.org, Mar 5 2018

I just filed crbug.com/818852. Blamelists are skipping CLs.
AIUI, the issue with the CL cited in #21 is that we have a name collision between chrome's Display.dll (used in component builds) and the system's Display.dll in System32. The new mitigation policy forces us to load the system32 version, which lacks the APIs we use.
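The collision described above can be modeled with a toy two-directory lookup. This is a deliberately simplified sketch, not the real Windows loader search order: the directory tables, the `resolve` and `can_bind` helpers, and the export names are all hypothetical placeholders.

```python
# Hypothetical module tables: DLL file name -> set of exported symbols.
# The export names are illustrative placeholders, not real export lists.
APP_DIR = {"display.dll": {"GetSystemScaleFactor"}}
SYSTEM32 = {"display.dll": {"SomeSystemExport"}}


def resolve(dll_name, prefer_system32):
    """Return (location, exports) for the first directory that has dll_name."""
    order = [("system32", SYSTEM32), ("app_dir", APP_DIR)]
    if not prefer_system32:
        order.reverse()  # application directory is searched first
    for location, directory in order:
        if dll_name in directory:
            return location, directory[dll_name]
    raise FileNotFoundError(dll_name)


def can_bind(dll_name, symbol, prefer_system32):
    """True if `symbol` resolves against the DLL the lookup picks."""
    _, exports = resolve(dll_name, prefer_system32)
    return symbol in exports


# Without the preference, the component-build DLL is found and the import binds.
print(can_bind("display.dll", "GetSystemScaleFactor", prefer_system32=False))
# With the preference, the same-named system DLL wins and the import is missing.
print(can_bind("display.dll", "GetSystemScaleFactor", prefer_system32=True))
```

In this model the first call prints True and the second prints False, which mirrors the missing-export error reported in #17.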

Comment 25 by zmo@chromium.org, Mar 5 2018

In the skipped blamelist, there is a Win sandbox related CL:

crrev.com/540634

Looks like a very likely candidate.

Comment 26 by zmo@chromium.org, Mar 5 2018

Owner: penny...@chromium.org
Status: Assigned (was: Untriaged)
Confirmed. It is crrev.com/540634.

Comment 27 by kbr@chromium.org, Mar 5 2018

Blocking: 818852
Status: Started (was: Assigned)

Comment 29 by wfh@chromium.org, Mar 5 2018

Can we increase trybot coverage to build/test this configuration? I think it was just a component build that broke? We can add extra trybots in presubmit for sandbox changes. Is 'win_optional_gpu_tests_rel' the right bot?

Comment 30 by zmo@chromium.org, Mar 6 2018

It's a DCHECK failure and win_optional_gpu_tests_rel doesn't have dcheck_always_on any more.

Comment 31 by kbr@chromium.org, Mar 6 2018

Actually win_optional_gpu_tests_rel and the other trybots do set dcheck_always_on; it's just the waterfall bots which do not set it any more. This can be seen in https://cs.chromium.org/chromium/src/tools/mb/mb_config.pyl .

We don't have capacity on the GPU bots to add a new CQ bot for the component build. Right now Chromium's testing strategy for the component build is that it's tested on the waterfall Debug bots. (See mb_config.pyl and how the various waterfall bots and trybots are configured.) If we want to switch win7_chromium_rel_ng to test the component build rather than the statically linked build, that's fine with me, though probably not fine with others. Not sure what tests win_chromium_dbg_ng runs?

Basically – if one of the non-GPU Win Chromium trybot configurations can be switched to test the component build, I would hope that at least one of the regular, non-GPU, test harnesses would have caught this regression.

For the record, this was a debug, component-build, run-time-gpu issue.

A CQ bot that builds this config, and runs any test that fires up a Chromium with GPU child process would be required.  Oh, and it would need to be a bot that is kept up to date with latest Win10 (as new security mitigations are usually in the latest update).

Fix close to landing: 
https://chromium-review.googlesource.com/c/chromium/src/+/950128

Comment 33 by bugdroid1@chromium.org, Mar 6 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/fc897c242fac290704e0f3e9297ce97012e849f0

commit fc897c242fac290704e0f3e9297ce97012e849f0
Author: Penny MacNeil <pennymac@chromium.org>
Date: Tue Mar 06 01:32:20 2018

[Windows Sandbox] Disable PreferSys32 mitigation for now.

Breaking debug component build, gpu process.  There is a component DLL called
display.dll that is clashing with the same name in system32.

Temporary disable for the moment.

(Note: also removing the other IMAGE_LOAD mitigations from the browser
process (post-startup) for now.  Only set in child processes.)

BUG= 818388 
TEST=sbox_integration_tests.exe, ProcessMitigationsTest.*

Change-Id: I1198e2222493a792df675d4b7675220d0da3459f
Reviewed-on: https://chromium-review.googlesource.com/950128
Reviewed-by: Will Harris <wfh@chromium.org>
Commit-Queue: Penny MacNeil <pennymac@chromium.org>
Cr-Commit-Position: refs/heads/master@{#541021}
[modify] https://crrev.com/fc897c242fac290704e0f3e9297ce97012e849f0/content/app/sandbox_helper_win.cc
[modify] https://crrev.com/fc897c242fac290704e0f3e9297ce97012e849f0/services/service_manager/sandbox/win/sandbox_win.cc

Status: Fixed (was: Started)

Comment 35 by kbr@chromium.org, Mar 6 2018

Blockedon: 808526
