
Issue 856398


Issue metadata

Status: Duplicate
Merged: issue 852796
Owner: ----
Closed: Jun 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 1
Type: Bug

Blocked on:
issue 810437




GPU process sometimes fails to exit running webkit_layout_tests on win7_chromium_rel_ng

Project Member Reported by kbr@chromium.org, Jun 26 2018

Issue description

Occasionally, the GPU process launched by content_shell.exe while running webkit_layout_tests fails to terminate cleanly. This prevents Swarming from deleting the temporary directory and fails the entire shard. The result is an unactionable CQ failure that forces the entire run to be retried.

Here's one example:
https://ci.chromium.org/p/chromium/builders/luci.chromium.try/win7_chromium_rel_ng/24116

Specifically, this shard:
https://chromium-swarm.appspot.com/task?id=3e518c9492128210&refresh=10&show_raw=1

Here's a log excerpt:

14:10:23.253 1680 Testing completed. Exit status: 0
Failed to delete e:\b\s\w\ir (4 files remaining).
  Maybe the test has a subprocess outliving it.
  Sleeping 2 seconds.
Failed to delete e:\b\s\w\ir (4 files remaining).
  Maybe the test has a subprocess outliving it.
  Sleeping 4 seconds.
Failed to delete e:\b\s\w\ir. The following files remain:
- \\?\e:\b\s\w\ir
- \\?\e:\b\s\w\ir\out
- \\?\e:\b\s\w\ir\out\Release
- \\?\e:\b\s\w\ir\out\Release\content_shell.exe
Enumerating processes:
- pid 4248; Handles: 2; Exe: None; Cmd: "e:\b\s\w\ir\out\Release\content_shell.exe" --type=gpu-process --field-trial-handle=812,3443643823849636651,14475209876073705687,131072 --enable-features=OutOfBlinkCORS --disable-gpu-sandbox --disable-gpu-rasterization --disable-skia-runtime-opts --enable-logging --run-web-tests --enable-crash-reporter --crash-dumps-dir="e:\b\s\w\ir\out\Release\crash-dumps\reports" --register-font-files="e:\b\s\w\ir\out\Release\/test_fonts/Ahem.ttf" --gpu-preferences=KAAAAAAAAACAAwBgAQAAAAAAAAAAAGAAEAAAAAAAAAAAAAAAAAAAACgAAAAEAAAAIAAAAAAAAAAoAAAAAAAAADAAAAAAAAAAOAAAAAAAAAAQAAAAAAAAAAAAAAAKAAAAEAAAAAAAAAAAAAAACwAAABAAAAAAAAAAAQAAAAoAAAAQAAAAAAAAAAEAAAALAAAA --use-gl=swiftshader --run-web-tests --enable-crash-reporter --crash-dumps-dir="e:\b\s\w\ir\out\Release\crash-dumps\reports" --register-font-files="e:\b\s\w\ir\out\Release\/test_fonts/Ahem.ttf" --enable-logging --service-request-channel-token=6229271957777897514 --mojo-platform-channel-handle=1920 /prefetch:2
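
The retry pattern visible in the log above (a failed delete, a sleep that doubles each time, then a listing of what is left) can be sketched roughly as follows. This is only an illustration of the pattern, not Swarming's actual implementation; `delete_with_backoff`, `list_remaining`, and their parameters are hypothetical names.

```python
import os
import shutil
import tempfile
import time


def list_remaining(root):
    """Return every path still present under root, including root itself."""
    if not os.path.exists(root):
        return []
    remaining = [root]
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            remaining.append(os.path.join(dirpath, name))
    return remaining


def delete_with_backoff(root, attempts=3, first_delay=2.0, sleep=time.sleep):
    """Try to delete root, sleeping 2s, 4s, ... between failed attempts.

    Returns the paths that could not be removed (empty list on success).
    A file held open by a lingering subprocess (on Windows) is the kind of
    thing that would keep this list non-empty.
    """
    delay = first_delay
    remaining = []
    for attempt in range(attempts):
        shutil.rmtree(root, ignore_errors=True)
        remaining = list_remaining(root)
        if not remaining:
            return []
        if attempt < attempts - 1:
            print("Failed to delete %s (%d files remaining)." % (root, len(remaining)))
            print("  Sleeping %g seconds." % delay)
            sleep(delay)
            delay *= 2
    return remaining


if __name__ == "__main__":
    # Demo on a throwaway directory; nothing holds it open, so it succeeds.
    scratch = tempfile.mkdtemp()
    open(os.path.join(scratch, "example.txt"), "w").close()
    print(delete_with_backoff(scratch))
```

With three attempts the sketch sleeps twice (2 seconds, then 4 seconds) before giving up, which matches the sequence in the excerpt.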

There's no good reason for the GPU process to fail to terminate when the overall harness does.

I'm not sure, but could this behavior be related to the switch to SwiftShader as the software renderer? Could one of SwiftShader's worker threads be stuck processing a bad draw call sent by an earlier layout test, thereby preventing the GPU process from exiting?

Is there more instrumentation that could be added to figure out why this is happening?

I don't know what percentage of CQ jobs are affected by this particular failure mode, but right now 80% of my own runs on win7_chromium_rel_ng are failing with layout test flakes, so I am marking this as P1.

Comment 1 by capn@chromium.org, Jun 26 2018

How would this be reproduced? I have an old WIP patch which may prevent infinite loops in vertex processing: https://swiftshader-review.googlesource.com/4893

Comment 2 by piman@chromium.org, Jun 26 2018

Is this the same as bug 852796?

Comment 3 by piman@chromium.org, Jun 26 2018

Cc: magchen@chromium.org
A new command line option, --disable-gpu-process-for-dx12-vulkan-info-collection, will be created to skip this GPU process.

Comment 5 by kbr@chromium.org, Jun 26 2018

Mergedinto: 852796
Status: Duplicate (was: Untriaged)
Yes, sorry, this is probably the same issue as Issue 852796.

Comment 6 by bugdroid1@chromium.org, Jun 27 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/c92718e66ebccd305fba542190cc242be5b7f44b

commit c92718e66ebccd305fba542190cc242be5b7f44b
Author: Maggie Chen <magchen@chromium.org>
Date: Wed Jun 27 02:25:17 2018

Create a new command line option --disable-gpu-process-for-dx12-vulkan-info-collection

This new command line option is created to disable the non-sandboxed
gpu process for collecting DX12/Vulkan information. Although this process only
exists for a very short period of time, it can sometimes interfere with the
layout tests or the performance tests. With this option, those tests can run
without any interference.

BUG=852796, 856398

Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel
Change-Id: I43582a15cd2451da9081111b79adec396c9acd4f
Reviewed-on: https://chromium-review.googlesource.com/1112463
Commit-Queue: Maggie Chen <magchen@chromium.org>
Reviewed-by: Antoine Labour <piman@chromium.org>
Cr-Commit-Position: refs/heads/master@{#570636}
[modify] https://crrev.com/c92718e66ebccd305fba542190cc242be5b7f44b/content/browser/browser_main_loop.cc
[modify] https://crrev.com/c92718e66ebccd305fba542190cc242be5b7f44b/content/shell/browser/layout_test/layout_test_content_browser_client.cc
[modify] https://crrev.com/c92718e66ebccd305fba542190cc242be5b7f44b/gpu/config/gpu_switches.cc
[modify] https://crrev.com/c92718e66ebccd305fba542190cc242be5b7f44b/gpu/config/gpu_switches.h

Comment 7 by bugdroid1@chromium.org, Jul 9 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/f58fd11a72fd8ab08da30a0666be3cdf2efb8e7a

commit f58fd11a72fd8ab08da30a0666be3cdf2efb8e7a
Author: Maggie Chen <magchen@chromium.org>
Date: Mon Jul 09 22:59:36 2018

Disable non-sandboxed gpu process for layout and browser tests

The command line switch --disable-gpu-process-for-dx12-vulkan-info-collection
is added to layout tests and browser tests so the non-sandboxed gpu process,
which is used for DX12 and Vulkan info collection and histograms, can be
disabled to avoid interference. The info collection process is not needed
for these tests.

BUG=852796, 856398

Change-Id: I7fac317987996aeb4a5ea37e473468f69ef3d89a
Reviewed-on: https://chromium-review.googlesource.com/1123167
Reviewed-by: Zhenyao Mo <zmo@chromium.org>
Reviewed-by: Antoine Labour <piman@chromium.org>
Commit-Queue: Maggie Chen <magchen@chromium.org>
Cr-Commit-Position: refs/heads/master@{#573501}
[modify] https://crrev.com/f58fd11a72fd8ab08da30a0666be3cdf2efb8e7a/content/public/test/browser_test_base.cc
[modify] https://crrev.com/f58fd11a72fd8ab08da30a0666be3cdf2efb8e7a/content/public/test/test_launcher.cc
[modify] https://crrev.com/f58fd11a72fd8ab08da30a0666be3cdf2efb8e7a/content/shell/app/shell_main_delegate.cc
[modify] https://crrev.com/f58fd11a72fd8ab08da30a0666be3cdf2efb8e7a/content/shell/browser/layout_test/layout_test_browser_main.cc
[modify] https://crrev.com/f58fd11a72fd8ab08da30a0666be3cdf2efb8e7a/content/shell/browser/layout_test/layout_test_content_browser_client.cc
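
As a rough illustration of the effect of the change above on a test launcher's command line, here is a minimal Python sketch that appends the switch only when it is absent. The helper `with_switch` and the example argv are hypothetical; the actual change is made in C++ in the files listed above.

```python
# Switch name taken from the commit message above.
SWITCH = "--disable-gpu-process-for-dx12-vulkan-info-collection"


def with_switch(argv, switch=SWITCH):
    """Return a copy of argv with switch appended unless already present."""
    argv = list(argv)  # copy so the caller's list is not mutated
    if switch not in argv:
        argv.append(switch)
    return argv


if __name__ == "__main__":
    # Hypothetical launcher arguments, for illustration only.
    print(with_switch(["content_shell.exe", "--run-web-tests"]))
```

Applying the helper twice leaves the argument list unchanged, so it is safe to call from more than one layer of a harness.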
