Issue 782471

Issue metadata

Status: Fixed
Owner:
Closed: Jan 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux
Pri: 1
Type: Bug

Blocking:
issue 771365



gl_unittests failing on chromium.memory/Linux CFI

Project Member Reported by sky@chromium.org, Nov 8 2017

Issue description

gl_unittests failing on chromium.memory/Linux CFI

Builders failed on: 
- Linux CFI: 
  https://build.chromium.org/p/chromium.memory/builders/Linux%20CFI


AFAICT this is the first time it failed: https://uberchromegw.corp.google.com/i/chromium.memory/builders/Linux%20CFI/builds/3528

Initially it crashed:
[ RUN      ] GLContextGLXTest.DoNotDestroyOnFailedMakeCurrent
Received signal 11 SEGV_MAPERR 000000000000
#0 0x000000692f7c base::debug::StackTrace::StackTrace()
#1 0x000000692ca4 base::debug::(anonymous namespace)::StackDumpSignalHandler()
#2 0x7f3494367330 <unknown>
#3 0x00000050db73 gl::GLSurfaceEGL::ShutdownOneOff()
#4 0x00000072bd99 gl::init::ShutdownGLPlatform()
#5 0x000000726f79 gl::init::ShutdownGL()
#6 0x000000685791 gl::GLSurfaceTestSupport::InitializeOneOffImplementation()
#7 0x0000006846af gl::GLImageTestSupport::InitializeGL()
#8 0x00000042fb4b gl::GLContextGLXTest_DoNotDestroyOnFailedMakeCurrent_Test::TestBody()
#9 0x0000004984c4 testing::Test::Run()
#10 0x000000498b7e testing::TestInfo::Run()
#11 0x0000004993c2 testing::TestCase::Run()
#12 0x00000049e633 testing::internal::UnitTestImpl::RunAllTests()
#13 0x00000049e2fd testing::UnitTest::Run()
#14 0x000000759ebd base::TestSuite::Run()
#15 0x000000430a21 _ZN4base8internal7InvokerINS0_9BindStateIMNS_9TestSuiteEFivEJNS0_17UnretainedWrapperIN12_GLOBAL__N_111GlTestSuiteEEEEEEFivEE7RunImplIRKS5_RKNSt3__15tupleIJS9_EEEJLm0EEEEiOT_OT0_NSG_16integer_sequenceImJXspT1_EEEE
#16 0x0000007704e0 base::(anonymous namespace)::LaunchUnitTestsInternal()
#17 0x0000007703a9 base::LaunchUnitTests()
#18 0x0000004306c7 main
#19 0x7f348f21ff45 __libc_start_main
#20 0x0000003ed029 <unknown>
  r8: 0000057eee2c9e00  r9: 0000000000000000 r10: 00007ffc59220590 r11: 00007f348f384390
 r12: 0000015f98c36e7b r13: 0000057eee2a9a50 r14: 0000000000000001 r15: 0000057eee2856f0
  di: 0000000000000000  si: 0000000000000001  bp: 00007ffc59220540  bx: 0000000000000001
  dx: 0000000000000006  ax: 0000000000000000  cx: a3dae03071735400  sp: 00007ffc59220540
  ip: 000000000050db73 efl: 0000000000010202 cgf: 0000000000000033 erf: 0000000000000004
 trp: 000000000000000e msk: 0000000000000000 cr2: 0000000000000000
[end of stack trace]
Calling _exit(1). Core file will not be generated.

And on rerun a different error:
[ RUN      ] GLContextGLXTest.DoNotDestroyOnFailedMakeCurrent
[28467:28467:1107/151427.765796:14203069615:ERROR:gl_surface_glx.cc(417)] glxQueryVersion failed
[28467:28467:1107/151427.765827:14203069641:ERROR:gl_initializer_x11.cc(156)] GLSurfaceGLX::InitializeOneOff failed.
Received signal 11 SEGV_MAPERR 000000000000
#0 0x000000692f7c base::debug::StackTrace::StackTrace()
#1 0x000000692ca4 base::debug::(anonymous namespace)::StackDumpSignalHandler()
#2 0x7f8096264330 <unknown>
#3 0x00000051c2eb gl::(anonymous namespace)::GetConfigForWindow()
#4 0x00000051c220 gl::NativeViewGLSurfaceGLX::GetConfig()
#5 0x00000051b64b gl::NativeViewGLSurfaceGLX::Initialize()
#6 0x0000004faff9 gl::InitializeGLSurfaceWithFormat()
#7 0x0000004fb0a6 gl::InitializeGLSurface()
#8 0x00000042fb7c gl::GLContextGLXTest_DoNotDestroyOnFailedMakeCurrent_Test::TestBody()
#9 0x0000004984c4 testing::Test::Run()
#10 0x000000498b7e testing::TestInfo::Run()
#11 0x0000004993c2 testing::TestCase::Run()
#12 0x00000049e633 testing::internal::UnitTestImpl::RunAllTests()
#13 0x00000049e2fd testing::UnitTest::Run()
#14 0x000000759ebd base::TestSuite::Run()
#15 0x000000430a21 _ZN4base8internal7InvokerINS0_9BindStateIMNS_9TestSuiteEFivEJNS0_17UnretainedWrapperIN12_GLOBAL__N_111GlTestSuiteEEEEEEFivEE7RunImplIRKS5_RKNSt3__15tupleIJS9_EEEJLm0EEEEiOT_OT0_NSG_16integer_sequenceImJXspT1_EEEE
#16 0x0000007704e0 base::(anonymous namespace)::LaunchUnitTestsInternal()
#17 0x0000007703a9 base::LaunchUnitTests()
#18 0x0000004306c7 main
#19 0x7f809111cf45 __libc_start_main
#20 0x0000003ed029 <unknown>
  r8: 0000000000000000  r9: 0000000000000000 r10: 000011989af56d00 r11: 0000000000000000
 r12: 000011989af4fb40 r13: 000011989afad800 r14: 0000000000000021 r15: 0000000003000001
  di: 0000000000000000  si: 0000000000000043  bp: 00007ffef08d14c0  bx: 000011989af4fb40
  dx: 0000000000000000  ax: 0000000000000021  cx: 000011989af56c80  sp: 00007ffef08d12a0
  ip: 000000000051c2eb efl: 0000000000010202 cgf: 0000000000000033 erf: 0000000000000004
 trp: 000000000000000e msk: 0000000000000000 cr2: 0000000000000000
[end of stack trace]

Regression range is https://test-results.appspot.com/revision_range?start=514583&end=514614 . I don't see anything immediately obvious in that range. Maybe zmo's panel fitting change? https://chromium-review.googlesource.com/c/755978/ Will ping him.
 

Comment 1 by sky@chromium.org, Nov 8 2017

Cc: kbr@chromium.org zmo@chromium.org

Comment 2 by sky@chromium.org, Nov 8 2017

zmo says his patch only affects ChromeOS, and the failure is not on ChromeOS, so it's not zmo's change. I don't see any changes that look relevant, but I'm not an expert here. As the test isn't consistently failing (last 4 out of 5 failures), it's possible the problematic patch is older.

Comment 3 by sky@chromium.org, Nov 8 2017

zmo says, "since this looks like GPU process service side mem corruption, I would say geofflang or the ANGLE roll might be (remotely) related."

I'll also add that the bot just cycled green, so this is going to be tricky to identify.
Owner: geoffl...@chromium.org
Status: Assigned (was: Available)
Hi Geoff, it was suggested you take a look.
Status: WontFix (was: Assigned)
I don't think that ANGLE caused this issue, given that it's in the GLX code, but the bot has been green for a while now, so I'm going to WontFix this.

Comment 7 by kbr@chromium.org, Nov 9 2017

Cc: sadrul@chromium.org
+sadrul who added that test recently. It might be flaky.

Status: Assigned (was: WontFix)
Unfortunately, this is still happening:
https://uberchromegw.corp.google.com/i/chromium.memory/builders/Linux%20CFI/builds/3587
Assigning to sadrul.  This is definitely not an ANGLE issue.
Cc: geoffl...@chromium.org
Owner: sadrul@chromium.org
Labels: -Pri-2 Pri-1
Sadrul, can you disable this test if there isn't an obvious fix?
Labels: -Sheriff-Chromium
The bot has been green for a while, so I'm going to take this off the sheriffs' attention. I don't know offhand if anything changed to actually fix the problem or if it's still just hidden flake.
-4529 and -4578

Comment 15 by kbr@chromium.org, Dec 23 2017

Blocking: 771365
Cc: p...@chromium.org vtsyrklevich@chromium.org julien.isorce@chromium.org
Components: -Internals>GPU Internals>GPU>Internals
Owner: kbr@chromium.org
I think this is more a bug related to CFI than a bug in the test. The stack trace is as follows:

[ RUN      ] GLContextGLXTest.DoNotDestroyOnFailedMakeCurrent
Received signal 11 SEGV_MAPERR 000000000000
#0 0x0000006984fc base::debug::StackTrace::StackTrace()
#1 0x000000698224 base::debug::(anonymous namespace)::StackDumpSignalHandler()
#2 0x7fa4fc9d3330 <unknown>
#3 0x000000516c23 gl::GLSurfaceEGL::ShutdownOneOff()
#4 0x000000737f69 gl::init::ShutdownGLPlatform()
#5 0x00000073313d gl::init::ShutdownGL()
#6 0x000000690063 gl::GLSurfaceTestSupport::InitializeOneOffImplementation()
#7 0x00000068f01f gl::GLImageTestSupport::InitializeGL()
#8 0x00000043615b gl::GLContextGLXTest_DoNotDestroyOnFailedMakeCurrent_Test::TestBody()
#9 0x00000049fed4 testing::Test::Run()
#10 0x0000004a058e testing::TestInfo::Run()
#11 0x0000004a0dd2 testing::TestCase::Run()
#12 0x0000004a6083 testing::internal::UnitTestImpl::RunAllTests()
#13 0x0000004a5d1d testing::UnitTest::Run()
#14 0x0000007684ed base::TestSuite::Run()
#15 0x000000437031 _ZN4base8internal7InvokerINS0_9BindStateIMNS_9TestSuiteEFivEJNS0_17UnretainedWrapperIN12_GLOBAL__N_111GlTestSuiteEEEEEEFivEE7RunImplIRKS5_RKNSt3__15tupleIJS9_EEEJLm0EEEEiOT_OT0_NSG_16integer_sequenceImJXspT1_EEEE
#16 0x00000077f150 base::(anonymous namespace)::LaunchUnitTestsInternal()
#17 0x00000077f019 base::LaunchUnitTests()

It looks to me like a bug was introduced in https://chromium-review.googlesource.com/769652 where resetting the g_angle_reset_platform function pointer obtains a write lock on the wrong piece of memory. I'll put up a CL fixing this.
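
To make the suspected mistake concrete, here is a minimal, self-contained sketch of the pattern. It is not the code from ui/gl/angle_platform_impl.cc and does not use the real ProtectedMemory API. GuardedSlot, ScopedWritable, g_get_platform, g_reset_platform, ResetBuggy, and ResetFixed are all hypothetical names, and a boolean flag stands in for real memory protection. The assumption is only that a write is legal while a scoped "writer" exists for that same slot.

```cpp
#include <stdexcept>

// Hypothetical function-pointer type stored in the guarded slots.
using PlatformFn = void (*)();

// Hypothetical stand-in for a protected function-pointer slot. A real
// implementation would change page permissions; a flag simulates that here.
class GuardedSlot {
 public:
  PlatformFn Get() const { return value_; }

  // Writes succeed only while a ScopedWritable for *this* slot is alive.
  void Set(PlatformFn fn) {
    if (!writable_) throw std::runtime_error("write to read-only slot");
    value_ = fn;
  }

 private:
  friend class ScopedWritable;
  PlatformFn value_ = nullptr;
  bool writable_ = false;
};

// RAII helper: makes exactly one slot writable for the current scope.
class ScopedWritable {
 public:
  explicit ScopedWritable(GuardedSlot& slot) : slot_(slot) {
    slot_.writable_ = true;
  }
  ~ScopedWritable() { slot_.writable_ = false; }
  ScopedWritable(const ScopedWritable&) = delete;
  ScopedWritable& operator=(const ScopedWritable&) = delete;

 private:
  GuardedSlot& slot_;
};

// Two independent slots, named after the roles described in the comment.
GuardedSlot g_get_platform;
GuardedSlot g_reset_platform;

// Buggy pattern: unlocks one slot, then writes to the other.
// Returns false because the write is rejected.
bool ResetBuggy() {
  ScopedWritable writer(g_get_platform);  // wrong slot unlocked
  try {
    g_reset_platform.Set(nullptr);
  } catch (const std::runtime_error&) {
    return false;
  }
  return true;
}

// Fixed pattern: unlocks the same slot that is about to be written.
bool ResetFixed() {
  ScopedWritable writer(g_reset_platform);
  g_reset_platform.Set(nullptr);
  return true;
}
```

In this sketch the mismatched write is reported as an exception so it can be observed. With real page protection, the equivalent write would go to memory that was never made writable.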

Comment 16 by kbr@chromium.org, Dec 23 2017

Owner: vtsyrklevich@chromium.org
CQ'ing https://chromium-review.googlesource.com/843634 TBR'd to see if this test goes green. Assigning to Vlad to follow up and verify.

Comment 17 by bugdroid1@chromium.org, Dec 23 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/70bdda1c7b9043f9776e63825b30ef60574994f3

commit 70bdda1c7b9043f9776e63825b30ef60574994f3
Author: Kenneth Russell <kbr@chromium.org>
Date: Sat Dec 23 02:05:37 2017

[cfi-icall] Fix ProtectedMemory usage for ANGLE icall.

Resetting the g_angle_reset_platform function pointer was incorrectly
requesting writable memory for g_angle_get_platform.

BUG= 782471 , 771365
TBR=pcc@chromium.org

Cq-Include-Trybots: master.tryserver.chromium.android:android_optional_gpu_tests_rel;master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel
Change-Id: I3c558f813f84a6495dfe530129515b36f20dfd2b
Reviewed-on: https://chromium-review.googlesource.com/843634
Commit-Queue: Kenneth Russell <kbr@chromium.org>
Reviewed-by: Kenneth Russell <kbr@chromium.org>
Cr-Commit-Position: refs/heads/master@{#526128}
[modify] https://crrev.com/70bdda1c7b9043f9776e63825b30ef60574994f3/ui/gl/angle_platform_impl.cc

Comment 18 by kbr@chromium.org, Dec 24 2017

This bot's been green since https://chromium-review.googlesource.com/843634 ; so far the last gl_unittests failure was https://ci.chromium.org/buildbot/chromium.memory/Linux%20CFI/4701 . But it's been green for long periods before that fix too. Not clear whether this failure is flaky for the same reason described in https://bugs.chromium.org/p/chromium/issues/detail?id=795332#c5 .

Status: Fixed (was: Assigned)
I believe the fix kbr@ implemented was correct. Thanks for finding and fixing that.
