
Issue 628823

Starred by 5 users

Issue metadata

Status: Fixed
Owner:
Closed: Aug 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux
Pri: 1
Type: Bug




Occasional xcb failure running WebGL 2.0 conformance tests on Linux Intel

Project Member Reported by kbr@chromium.org, Jul 16 2016

Issue description

Example build:
https://build.chromium.org/p/chromium.gpu.fyi/builders/Linux%20Release%20%28New%20Intel%29/builds/2595

WebglConformance_conformance2_context_context_attributes_depth_stencil_antialias_obeyed failed:

WebglConformance_conformance2_context_context_attributes_depth_stencil_antialias_obeyed (gpu_tests.webgl_conformance_integration_test.WebGLConformanceIntegrationTest) ... [xcb] Unknown sequence number while processing queue
[xcb] Most likely this is a multi-threaded client and XInitThreads has not been called
[xcb] Aborting, sorry about that.
ed_runVCmPDT/out/Release/chrome --type=gpu-process --mojo-channel-token=D764384F388F9D0D6956C6143EE70C58 --mojo-application-channel-token=7FB3DA27ACDF11145096833FA28A24FA --enable-features=DocumentWriteEvaluator<DisallowFetchForDocWrittenScriptsInMainFrame,ExpectCTReporting<ExpectCTReporting,IncidentReportingDisableUpload<SafeBrowsingIncidentReportingService,IncidentReportingModuleLoadAnalysis<SafeBrowsingIncidentReportingServiceFeatures,IncidentReportingSuspiciousModuleReporting<SafeBrowsingIncidentReportingServiceFeatures,MainFrameBeforeActivation<MainFrameBeforeActivation,NetworkTimeServiceQuerying<NetworkTimeQueries,NewAudioRenderingMixingStrategy<NewAudioRenderingMixingStrategy,NonValidatingReloadOnNormalReload<NonValidatingReloadOnNormalReload,PointerEvent<PointerEvent,PreconnectMore<PreconnectMore,UsePasswordSeparatedSigninFlow<PasswordSeparatedSigninFlow,WebRTC-EnableWebRtcEcdsa<WebRTC-EnableWebRtcEcdsa,WebRTC-H264WithOpenH264FFmpeg<WebRTC-H264WithOpenH264FFmpeg,token-binding<TokenBinding,use-new-media-cache<use-new-media-cache 
--force-fieldtrials=AsyncDNS/AsyncDNSA/AutofillClassifier/Enabled/AutofillFieldMetadata/Enabled/AutofillProfileOrderByFrecency/EnabledLimitTo3/CaptivePortalInterstitial/Enabled/ChromeDashboard/Enabled/ChromotingQUIC/Enabled/*DisallowFetchForDocWrittenScriptsInMainFrame/DocumentWriteEvaluatorGroup/EnableGoogleCachedCopyTextExperiment/Button/*EnableMediaRouter/Enabled/EnableMediaRouterWithCastExtension/Enabled/EnableSessionCrashedBubbleUI/Enabled/ExpectCTReporting/ExpectCTReportingEnabled/ExtensionActionRedesign/Enabled/GoogleBrandedContextMenu/branded/InstanceID/Enabled/*MainFrameBeforeActivation/Enabled/MaterialDesignDownloads/Enabled/MojoChannel/Enabled/NetworkTimeQueries/NetworkTimeQueriesEnabled/NewAudioRenderingMixingStrategy/Enabled/NonValidatingReloadOnNormalReload/Enabled/OfferUploadCreditCards/Enabled/OutOfProcessPac/Enabled/*PageRevisitInstrumentation/Enabled/PasswordBranding/SmartLockBrandingSavePromptOnly/*PasswordManagerSettingsMigration/Enable/PasswordSeparatedSigninFlow/Enabled/*PointerEvent/Enabled/*PreconnectMore/Enabled/*QUIC/Enabled/RefreshTokenDeviceId/Enabled/ReportCertificateErrors/ShowAndPossiblySend/SSLCommonNameMismatchHa: ../../src/xcb_io.c:274: poll_for_event: Assertion `!xcb_xlib_threads_sequence_lost' failed.
Received signal 6
#0 0x55b53c6f9c77 base::debug::(anonymous namespace)::StackDumpSignalHandler()
#1 0x7fecdaf533d0 <unknown>
#2 0x7fecd4cdd418 gsignal
#3 0x7fecd4cdf01a abort
#4 0x7fecd4cd5bd7 <unknown>
#5 0x7fecd4cd5c82 __assert_fail
#6 0x7fecd9b68199 <unknown>
#7 0x7fecd9b6824b <unknown>
#8 0x7fecd9b6855d _XEventsQueued
#9 0x7fecd9b59f47 XPending
#10 0x55b53d089f8d ui::(anonymous namespace)::XSourcePrepare()
#11 0x7fecd9eae8ad g_main_context_prepare
#12 0x7fecd9eaf24b <unknown>
#13 0x7fecd9eaf42c g_main_context_iteration
#14 0x55b53c71b7e6 base::MessagePumpGlib::Run()
#15 0x55b53c719151 base::MessageLoop::RunHandler()
#16 0x55b53c73c290 base::RunLoop::Run()
#17 0x55b53c038da6 content::GpuMain()
#18 0x55b53c338b2b content::RunNamedProcessTypeMain()
#19 0x55b53c3395a3 content::ContentMainRunnerImpl::Run()
#20 0x55b53c337e70 content::ContentMain()
#21 0x55b53996d6bb ChromeMain
#22 0x7fecd4cc8830 __libc_start_main
#23 0x55b53996d58d <unknown>
  r8: fefefefefefefeff  r9: fefeff092d63646b r10: 0000000000000008 r11: 0000000000000202
 r12: 0000000000000112 r13: 00007fecd9bdade8 r14: 000000007fffffff r15: 000055b53d089f80
  di: 0000000000002385  si: 0000000000002385  bp: 00007fecd9bdaae0  bx: 00007fecc76ed000
  dx: 0000000000000006  ax: 0000000000000000  cx: 00007fecd4cdd418  sp: 00007ffeed9f88d8
  ip: 00007fecd4cdd418 efl: 0000000000000202 cgf: 0000000000000033 erf: 0000000000000000
 trp: 0000000000000000 msk: 0000000000000000 cr2: 0000000000000000
[end of stack trace]
[32457:32457:0715/171121:ERROR:gpu_process_transport_factory.cc(865)] Lost UI shared context.
[1:1:0715/171121:ERROR:context_provider_command_buffer.cc(191)] Failed to initialize GLES2Implementation.
[32457:32457:0715/171121:INFO:CONSOLE(11)] "Fail to create a context", source:  (11)
[32457:32457:0715/171121:INFO:CONSOLE(11)] "FAIL Fail to create a context", source:  (11)
[32457:32457:0715/171121:INFO:CONSOLE(0)] "WebGL: CONTEXT_LOST_WEBGL: loseContext: context lost", source: http://127.0.0.1:36754/conformance2/context/context-attributes-depth-stencil-antialias-obeyed.html?webglVersion=2 (0)
[32457:32457:0715/171121:INFO:CONSOLE(0)] "WebGL: CONTEXT_LOST_WEBGL: loseContext: context lost", source: http://127.0.0.1:36754/conformance2/context/context-attributes-depth-stencil-antialias-obeyed.html?webglVersion=2 (0)
[32457:32457:0715/171121:INFO:CONSOLE(0)] "WebGL: CONTEXT_LOST_WEBGL: loseContext: context lost", source: http://127.0.0.1:36754/conformance2/context/context-attributes-depth-stencil-antialias-obeyed.html?webglVersion=2 (0)
[32457:32457:0715/171121:INFO:CONSOLE(0)] "WebGL: CONTEXT_LOST_WEBGL: loseContext: context lost", source: http://127.0.0.1:36754/conformance2/context/context-attributes-depth-stencil-antialias-obeyed.html?webglVersion=2 (0)

Traceback (most recent call last):
  _RunGpuTest at content/test/gpu/gpu_tests/gpu_integration_test.py:49
    self.RunActualGpuTest(url, *args)
  RunActualGpuTest at content/test/gpu/gpu_tests/webgl_conformance_integration_test.py:50
    self.fail(webgl_conformance._WebGLTestMessages(self.tab))
  fail at /usr/lib/python2.7/unittest/case.py:410
    raise self.failureException(msg)
AssertionError: Fail to create a context
FAIL Fail to create a context

Locals:
  msg : u'Fail to create a context\nFAIL Fail to create a context\n'

WARNING:root:Restarting browser due to unexpected test failure



Have seen this a couple of times, I think, but not sure. P3 for now.

Not sure of the resolution. Do we need to mark all the tests flaky on this configuration? That would be unfortunate.
 

Comment 1 Deleted

jglad, can you provide more info on the fix? (link to CL, bug)

Comment 3 by kbr@chromium.org, Jul 18 2016

I think the above comment is spam.

Components: Internals>GPU>Internals
Labels: -Pri-3 Pri-1
Status: Available (was: Untriaged)
Raising priority on this one: the random failures are causing the Intel Release bot to be mostly red, with one randomly failing test.
Cc: sugoi@chromium.org

Comment 6 by kbr@chromium.org, Aug 1 2016

Cc: piman@chromium.org yunchao...@intel.com qiankun....@intel.com
Any suggestion on what might be going on here? Is this a problem in Chrome, the XCB implementation, etc.? CC'ing piman@ for expert advice and Intel folks.

Cc: xinghua....@intel.com
Xinghua, could you have a look at this P1 bug? This bug is Intel-specific and is making the Linux Intel bot fail.
Cc: yang...@intel.com
I am looking into it now. I can reproduce it on Mesa 11.2.0 but not on Mesa 12.*.

Comment 11 by piman@google.com, Aug 2 2016

[xcb] Unknown sequence number while processing queue
[xcb] Most likely this is a multi-threaded client and XInitThreads has not been called
[xcb] Aborting, sorry about that.


This sounds like something we should validate - we *should* be calling XInitThreads before creating displays, but maybe something regressed there.
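
As a rough illustration of the ordering requirement mentioned above, here is a minimal Python sketch that calls the real Xlib function XInitThreads through ctypes. It is not Chromium's code (Chromium does this in C++). The helper name init_xlib_threads is hypothetical, and the sketch assumes libX11 may or may not be installed. Per the Xlib documentation, XInitThreads must be the first Xlib call a multi-threaded client makes, and it returns nonzero on success.

```python
import ctypes
import ctypes.util


def init_xlib_threads():
    """Call XInitThreads() if libX11 can be loaded (hypothetical helper).

    Returns True or False for success or failure of XInitThreads, or None
    when libX11 is not available on this machine.
    """
    path = ctypes.util.find_library("X11")
    if path is None:
        return None
    try:
        xlib = ctypes.CDLL(path)
    except OSError:
        return None
    xlib.XInitThreads.restype = ctypes.c_int  # Xlib Status: nonzero on success
    xlib.XInitThreads.argtypes = []
    return xlib.XInitThreads() != 0


if __name__ == "__main__":
    # This has to run before XOpenDisplay or any other Xlib call in the process.
    print("XInitThreads:", init_xlib_threads())
```

The call needs no display connection, so it can be made at the very start of a process before any display is opened.
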
An update on what I have found so far. I was wrong in comment #10. With Mesa 12.*, the test also fails after refreshing the test page many times. The failure is related to "gl = canvas.getContext("webgl")", regardless of whether a webgl or webgl2 context is requested. This means that getContext also fails for a WebGL 1.0 context if you refresh a test page many times (not only the reported test; other tests also trigger the crash).

Comment 13 by kbr@chromium.org, Aug 2 2016

Owner: qiankun....@intel.com
Status: Assigned (was: Available)
Hmmmm. Very suspicious. It's good that you all can reproduce this on the latest Mesa. Qiankun, could I please assign this to you? Please reassign if necessary.

The GPU process already calls XInitThreads. See:

https://cs.chromium.org/chromium/src/ui/gfx/x/x11_connection.cc
https://cs.chromium.org/chromium/src/ui/gl/gl_surface_glx.cc?sq=package:chromium&rcl=1470151102&l=363

https://codereview.chromium.org/2208743005 adds logging to check if XInitThreads has been called before XSetErrorHandler (see also https://codereview.chromium.org/2206973002/ ). I am not able to reproduce the bug on top-of-tree Chrome on Ubuntu Xenial with Mesa 11.2 on Broadwell.

What's weird is that it is an XCB error. Do we have extra threads in the GPU process, using the same display connection, that do not call XInitThreads?

Also, according to http://www.remlab.net/op/xlib.shtml, the X error handler is per-process, not per-display. Could it be that an XCB instance in another thread is expecting to have its error handler called and IgnoreX11Errors catches them?
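
The concern about a per-process handler can be illustrated with a toy model. This is a sketch in plain Python, not real Xlib and not Chromium code. ToyXlib, set_error_handler, and report_error are hypothetical names that only mimic the fact that XSetErrorHandler installs a single handler for the whole process and returns the previous one.

```python
class ToyXlib:
    """Toy stand-in for Xlib's process-wide error handler slot (not real Xlib)."""

    def __init__(self):
        self.log = []
        self._handler = self._default_handler

    def _default_handler(self, display, error):
        self.log.append(("default", display, error))

    def set_error_handler(self, handler):
        """Install a handler for every display and return the previous one."""
        previous = self._handler
        self._handler = handler
        return previous

    def report_error(self, display, error):
        """Deliver an error from any display to the single current handler."""
        self._handler(display, error)


def demo():
    xlib = ToyXlib()
    ignored = []
    # Component A only cares about display "A" and wants to ignore its errors.
    previous = xlib.set_error_handler(lambda d, e: ignored.append((d, e)))
    # An unrelated error on display "B" arrives while A's handler is installed.
    xlib.report_error("B", "BadWindow")
    # Restoring the previous handler sends later errors back to the default.
    xlib.set_error_handler(previous)
    xlib.report_error("B", "BadMatch")
    return ignored, xlib.log
```

In this model the error on display "B" lands in component A's "ignore" list, which is the kind of cross-talk the question above is asking about. Whether that is what happened in the GPU process is not established by this thread.
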

Comment 15 by kbr@chromium.org, Aug 3 2016

I think there's a thread for the SGIVideoSyncProviderShim. See https://cs.chromium.org/chromium/src/ui/gl/gl_surface_glx.cc?rcl=1470237684&l=373 .

Steps to reproduce this bug:
(1) Open conformance2/context/context-attributes-depth-stencil-antialias-obeyed.html
(2) Press and hold F5 for more than 1 minute to keep refreshing the page.
(3) The GPU process then crashes.

My testing env:
chromium: ToT
Linux: Ubuntu Wily 15.10
HW: Intel Skylake
Mesa: 11.2.0 and 12.0.1

Comment 17 by bugdroid1@chromium.org, Aug 4 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/a1d7428578d800f6e0b474dd07c5c760021ff1e1

commit a1d7428578d800f6e0b474dd07c5c760021ff1e1
Author: qiankun.miao <qiankun.miao@intel.com>
Date: Thu Aug 04 17:03:33 2016

Fix xcb crash when creating gl context

This is a regression introduced by
https://codereview.chromium.org/2139673002. GPU process crashed due an
assertion: '!xcb_xlib_threads_sequence_lost' failed.

BUG= 628823 

Review-Url: https://codereview.chromium.org/2206973002
Cr-Commit-Position: refs/heads/master@{#409814}

[modify] https://crrev.com/a1d7428578d800f6e0b474dd07c5c760021ff1e1/ui/gl/gl_context_glx.cc


Comment 18 by bugdroid1@chromium.org, Aug 5 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/angle/angle/+/f5207deab6e1f2e45f8adea1291e04cf1573fe3e

commit f5207deab6e1f2e45f8adea1291e04cf1573fe3e
Author: Corentin Wallez <cwallez@chromium.org>
Date: Fri Aug 05 15:51:40 2016

DisplayGLX: XSync before setting the error handler

This mirrors https://codereview.chromium.org/2206973002

BUG= 628823 

Change-Id: Ifd71d67df174cac3f90097c809fc91046699bed8
Reviewed-on: https://chromium-review.googlesource.com/366790
Reviewed-by: Jamie Madill <jmadill@chromium.org>
Commit-Queue: Corentin Wallez <cwallez@chromium.org>

[modify] https://crrev.com/f5207deab6e1f2e45f8adea1291e04cf1573fe3e/src/libANGLE/renderer/gl/glx/DisplayGLX.cpp
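
The two commits above are described as calling XSync before setting the error handler. The diffs are not shown in this thread, so the following is only a toy Python model of why that ordering can matter, under the assumption that errors from earlier asynchronous requests are delivered to whichever handler is installed when they are finally processed. ToyConnection, create_context, and demo are hypothetical names; none of this is real Xlib or Chromium code.

```python
from collections import deque


class ToyConnection:
    """Toy model of an asynchronous X connection (not real Xlib)."""

    def __init__(self):
        self.pending = deque()  # errors generated but not yet delivered
        self.delivered = []     # errors seen by the default handler
        self.handler = lambda error: self.delivered.append(("default", error))

    def send_request(self, name, fails=False):
        # Requests are buffered, so a failure only becomes visible later.
        if fails:
            self.pending.append(name)

    def sync(self):
        """Like XSync: process everything queued and deliver errors now."""
        while self.pending:
            self.handler(self.pending.popleft())

    def set_error_handler(self, handler):
        previous, self.handler = self.handler, handler
        return previous


def create_context(conn, sync_first):
    """Run a guarded request and return the errors the temporary handler saw."""
    seen = []
    if sync_first:
        conn.sync()  # drain earlier errors to the previous handler
    previous = conn.set_error_handler(seen.append)
    conn.send_request("CreateContext", fails=False)
    conn.sync()  # make sure errors from the guarded request arrive now
    conn.set_error_handler(previous)
    return seen


def demo(sync_first):
    conn = ToyConnection()
    conn.send_request("EarlierRequest", fails=True)  # unrelated, still queued
    return create_context(conn, sync_first), conn.delivered
```

Without the initial sync, demo(False) attributes the stale "EarlierRequest" error to the context-creation guard. With it, demo(True) leaves the guard clean and the earlier error goes to the previously installed handler. This shows the general pattern only; it does not claim to reproduce the xcb assertion reported in this issue.
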


Comment 19 by bugdroid1@chromium.org, Aug 8 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/e02d95b012101e8a127394c1eb6fd0707b870d9a

commit e02d95b012101e8a127394c1eb6fd0707b870d9a
Author: jmadill <jmadill@chromium.org>
Date: Mon Aug 08 20:44:00 2016

Roll ANGLE 9c721c6..3416ff3

https://chromium.googlesource.com/angle/angle.git/+log/9c721c6..3416ff3

BUG=,None,628823

TEST=bots
TBR=geofflang@chromium.org

CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.win:win_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.android:android_optional_gpu_tests_rel

Review-Url: https://codereview.chromium.org/2228553002
Cr-Commit-Position: refs/heads/master@{#410449}

[modify] https://crrev.com/e02d95b012101e8a127394c1eb6fd0707b870d9a/DEPS

Comment 20 by kbr@chromium.org, Aug 8 2016

Issue 633053 has been merged into this issue.

Comment 21 by sheriffbot@chromium.org, Aug 9 2016

Labels: Fracas FoundIn-M-54
Users experienced this crash on the following builds:

Linux Dev 54.0.2816.0 -  96.27 CPM, 186 reports, 70 clients (signature XSourcePrepare)

If this update was incorrect, please add "Fracas-Wrong" label to prevent future updates.

- Go/Fracas
Status: Fixed (was: Assigned)
This is fixed by https://codereview.chromium.org/2206973002. Closing it now; please reopen it if you can still reproduce it.
