Linux FYI GPU TSAN Release failing angle_end2end_tests because of timeout
Issue description

This test seems to have become extremely slow, and is timing out.

First failing: https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Linux%20FYI%20GPU%20TSAN%20Release/5925
Last known good: https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Linux%20FYI%20GPU%20TSAN%20Release/5924
ANGLE regression range: https://chromium.googlesource.com/angle/angle/+/41c43ce7f74906c87c5e1c6c8cc0673f8ab854b1
Test output: https://chromium-swarm.appspot.com/task?id=3ce997cf0314a710&refresh=10&show_raw=1

The test doesn't seem to have any errors; it just gets slower and slower as the run progresses. For instance, at the start of the run, things are fast:

ClientArraysTest.ForbidsClientSideElementBuffer/ES3_OPENGL (53 ms)

Near the end of the run, OpenGL tests are very, very slow:

Texture2DTest.ZeroSizedUploads/ES2_OPENGL (61045 ms)

I'm not sure what's happening here. Can anyone offer some suggestions?
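One way to confirm the "slower and slower" pattern from a saved test log is to pull out the per-test durations and compare early runs against late ones per backend. The following is only a rough sketch: it assumes result lines end in `Suite.Test/BACKEND (N ms)` as in the two examples quoted above, and the helper names `durations_by_backend` and `slowdown_ratio` are made up for illustration.

```python
import re
from collections import defaultdict

# Matches result text such as
# "Texture2DTest.ZeroSizedUploads/ES2_OPENGL (61045 ms)".
# Assumption: the backend is the last "/"-separated component of the test name.
_RESULT_RE = re.compile(r"(\S+)/(\w+) \((\d+) ms\)")


def durations_by_backend(log_text):
    """Return {backend: [duration_ms, ...]}, keeping the order of the log."""
    result = defaultdict(list)
    for match in _RESULT_RE.finditer(log_text):
        _test_name, backend, millis = match.groups()
        result[backend].append(int(millis))
    return dict(result)


def slowdown_ratio(durations, window=1):
    """Mean of the last `window` durations divided by the mean of the first.

    Returns None when there are too few samples or the early mean is zero.
    """
    if len(durations) < 2 * window:
        return None
    head_mean = sum(durations[:window]) / window
    tail_mean = sum(durations[-window:]) / window
    if head_mean == 0:
        return None
    return tail_mean / head_mean
```

A ratio well above 1 for the `*_OPENGL` backends, and not for others, would match what is described in this report. Comparing different tests is noisy, so this works best on a log produced with a single test repeated many times.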
,
Apr 18 2018
OTOH, only angle_end2end_tests became slower, maybe also angle_white_box_tests a bit. So, it could be that LVL are responsible. It should be possible to reproduce this locally and see what's going on.
,
Apr 19 2018
Tried locally, definitely reproducible, running e.g. angle_end2end_tests --gtest_filter='Texture2DTest.TexStorage*' --gtest_repeat=100 --single-process-tests
The OpenGL steps keep getting slower and slower (the Vulkan ones don't seem to).
Attached a debugger, and it's pretty consistently within a stack like this:
(gdb) bt
#0 0x000000000044e100 in FreeRange() ()
at /b/build/slave/linux_upload_clang/build/src/third_party/llvm/compiler-rt/lib/tsan/rtl/tsan_sync.cc:90
#1 0x000000000044dfcc in FreeBlock() ()
at /b/build/slave/linux_upload_clang/build/src/third_party/llvm/compiler-rt/lib/tsan/rtl/tsan_sync.cc:79
#2 0x0000000000437ddf in user_free() ()
at /b/build/slave/linux_upload_clang/build/src/third_party/llvm/compiler-rt/lib/tsan/rtl/tsan_mman.cc:200
#3 0x0000000000437ddf in user_free() ()
at /b/build/slave/linux_upload_clang/build/src/third_party/llvm/compiler-rt/lib/tsan/rtl/tsan_mman.cc:170
#4 0x0000000000438110 in user_realloc() ()
at /b/build/slave/linux_upload_clang/build/src/third_party/llvm/compiler-rt/lib/tsan/rtl/tsan_mman.cc:219
#5 0x00000000003e65d9 in __interceptor_realloc() ()
at /b/build/slave/linux_upload_clang/build/src/third_party/llvm/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:695
#6 0x00007f34612dbfab in () at /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.0
warning: (Internal error: pc 0x7f3469fe54e9 in read in psymtab, but not in symtab.)
warning: (Internal error: pc 0x7f3469fe54c0 in read in psymtab, but not in symtab.)
#7 0x00007f34612f4ede in () at /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.0
#8 0x00007f34612ef982 in () at /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.0
#9 0x00007f3461284f19 in () at /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.0
#10 0x00007f34612b2599 in () at /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.0
#11 0x00007f34612a815d in glXQueryExtensionsString ()
at /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.0
#12 0x00007f3469124ddc in initialize() ()
at ../../third_party/angle/src/libANGLE/renderer/gl/glx/FunctionsGLX.cpp:327
#13 0x00007f3469124ddc in initialize() ()
at ../../third_party/angle/src/libANGLE/renderer/gl/glx/FunctionsGLX.cpp:206
#14 0x00007f34691203b4 in initialize() ()
at ../../third_party/angle/src/libANGLE/renderer/gl/glx/DisplayGLX.cpp:108
#15 0x00007f3468ecb364 in initialize() ()
at ../../third_party/angle/src/libANGLE/Display.cpp:466
#16 0x00007f3468e56273 in Initialize() ()
at ../../third_party/angle/src/libGLESv2/entry_points_egl.cpp:87
#17 0x00007f3469fe54ea in eglInitialize ()
at ../../third_party/angle/src/libEGL/libEGL.cpp:87
#18 0x00007f3469ffd418 in initializeDisplayAndSurface() ()
at ../../third_party/angle/util/EGLWindow.cpp:226
#19 0x0000000000a6046c in ANGLETestSetUp() ()
at ../../third_party/angle/src/tests/test_utils/ANGLETest.cpp:299
#20 0x0000000000a623fa in ANGLETest::SetUp() ()
at ../../third_party/angle/src/tests/test_utils/ANGLETest.cpp:977
#21 0x00000000008a0211 in SetUp() ()
at ../../third_party/angle/src/tests/gl_tests/TextureTest.cpp:78
#22 0x000000000089f571 in SetUp() ()
at ../../third_party/angle/src/tests/gl_tests/TextureTest.cpp:162
#23 0x00000000008a00a1 in non-virtual thunk to (anonymous namespace)::Texture2DTest::SetUp() ()
at ../../third_party/googletest/src/googletest/include/gtest/internal/gtest-linked_ptr.h:153
#24 0x0000000000a8449c in Run() ()
at ../../third_party/googletest/src/googletest/src/gtest-internal-inl.h:920
#25 0x0000000000a854fd in Run() ()
at ../../third_party/googletest/src/googletest/src/gtest.cc:2667
#26 0x0000000000a85d77 in Run() ()
at ../../third_party/googletest/src/googletest/src/gtest.cc:2785
#27 0x0000000000a963e7 in RunAllTests() ()
at ../../third_party/googletest/src/googletest/src/gtest.cc:5047
#28 0x0000000000a95cec in Run() ()
at ../../third_party/googletest/src/googletest/src/gtest-internal-inl.h:920
#29 0x0000000000a9eb67 in Run() ()
at ../../third_party/googletest/src/googletest/include/gtest/gtest.h:2327
#30 0x0000000000a9eb67 in Run() () at ../../base/test/test_suite.cc:275
#31 0x0000000000a6b583 in (anonymous namespace)::RunHelper(base::TestSuite*) ()
at ../../gpu/angle_end2end_tests_main.cc:19
#32 0x0000000000a6b5d5 in Run() () at ../../base/bind_internal.h:402
#33 0x0000000000a6b5d5 in Run() () at ../../base/bind_internal.h:530
#34 0x0000000000a6b5d5 in Run() () at ../../base/bind_internal.h:604
#35 0x0000000000a6b5d5 in Run() () at ../../base/bind_internal.h:586
#36 0x0000000000aa24c4 in LaunchUnitTestsInternal() ()
at ../../base/callback.h:95
#37 0x0000000000aa24c4 in LaunchUnitTestsInternal() ()
at ../../base/test/launcher/unit_test_launcher.cc:225
#38 0x0000000000aa2ada in LaunchUnitTestsWithOptions() ()
at ../../base/test/launcher/unit_test_launcher.cc:597
#39 0x0000000000a6b501 in main() () at ../../gpu/angle_end2end_tests_main.cc:29
,
Apr 20 2018
The resident size seems to be growing at every iteration too, suggesting a leak.
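To put numbers on the growth, the resident set size can be sampled from `/proc/<pid>/status` on Linux while the repeated test run is in progress. This is a minimal sketch, not something taken from this bug: the function names are hypothetical, and it assumes the usual `VmRSS:   <n> kB` line format.

```python
import re


def parse_vm_rss_kb(status_text):
    """Extract VmRSS in kB from the text of /proc/<pid>/status, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kb(pid):
    """Read the current resident set size of `pid` in kB (Linux only)."""
    with open(f"/proc/{pid}/status") as status_file:
        return parse_vm_rss_kb(status_file.read())


def grows_monotonically(samples_kb, min_step_kb=0):
    """True if every sample exceeds the previous one by more than min_step_kb."""
    if len(samples_kb) < 2:
        return False
    return all(
        later - earlier > min_step_kb
        for earlier, later in zip(samples_kb, samples_kb[1:])
    )
```

Calling `read_rss_kb` once per `--gtest_repeat` iteration and passing the collected values to `grows_monotonically` gives a crude leak signal. A non-zero `min_step_kb` helps ignore allocator noise.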
,
Apr 20 2018
->cwallez, any idea?
,
Apr 20 2018
Issue 834563 has been merged into this issue.
,
Apr 20 2018
,
Apr 20 2018
I'll take a look if I'm able to repro.

Notes:
- glXQueryExtensionsString's return value doesn't need to be freed explicitly.
- The regression range doesn't seem very useful?

Several ideas, but I didn't check any yet:
- Does the resident size grow without TSAN too? If not, then maybe TSAN's tracking data keeps growing during the test?
- Does this happen on a different driver? If not, it could be caused by NVIDIA's driver being multi-threaded.
- Does this happen without the Vulkan backend? It could be caused by the initialization of the Vulkan validation layers.
,
Apr 20 2018
It could be related to the LVL. It's unclear, but I think that was in the regression range.
,
Apr 20 2018
FWIW I couldn't repro the RSS increase without TSAN, but it's also possible that it's a small leak (i.e. within noise) that gets magnified by TSAN. I could very well believe that glXQueryExtensionsString allocates and leaks the extension string, since it doesn't have a well-defined lifetime. Creating a new XDisplay (and therefore a new GLX extension client-side structure in the driver) many times (once for every test) is fairly unusual.

Assuming it's a leak in the driver, would it be completely unreasonable to avoid recreating/reinitializing the EGL display for every test in this test harness, and instead cache it, at least for the tests that don't care about the EGL parts of the API?
,
May 1 2018
Going to try disabling the LVL in this builder.
,
May 1 2018
The following revision refers to this bug:
https://chromium.googlesource.com/angle/angle/+/ad3aaeba3e0d7f704ca84b2bac4c23e1242014c9

commit ad3aaeba3e0d7f704ca84b2bac4c23e1242014c9
Author: Jamie Madill <jmadill@chromium.org>
Date: Tue May 01 12:50:52 2018

Disable Vulkan layers in sanitized builds.

This was causing very slow builds/test runs.

Bug: chromium:837166
Bug: chromium:834269
Change-Id: If2e5665455d4a8af13cbc732a65a07550ace8304
Reviewed-on: https://chromium-review.googlesource.com/1036220
Reviewed-by: Jamie Madill <jmadill@chromium.org>

[modify] https://crrev.com/ad3aaeba3e0d7f704ca84b2bac4c23e1242014c9/gni/angle.gni
,
May 1 2018
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/a3ffbd46d8a7667f14738fdf42796f344e3ced64

commit a3ffbd46d8a7667f14738fdf42796f344e3ced64
Author: angle-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com <angle-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Date: Tue May 01 17:49:32 2018

Roll src/third_party/angle/ ddd772455..ad3aaeba3 (1 commit)

https://chromium.googlesource.com/angle/angle.git/+log/ddd772455ce7..ad3aaeba3e0d

$ git log ddd772455..ad3aaeba3 --date=short --no-merges --format='%ad %ae %s'
2018-05-01 jmadill Disable Vulkan layers in sanitized builds.

Created with: roll-dep src/third_party/angle

BUG=chromium:837166, chromium:834269

The AutoRoll server is located here: https://angle-chromium-roll.skia.org

Documentation for the AutoRoller is here: https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, please contact the current sheriff, who should be CC'd on the roll, and stop the roller if necessary.

CQ_INCLUDE_TRYBOTS=luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel
TBR=cwallez@chromium.org
No-Try: True
Change-Id: I469040e3c7b571ae7d469330c2f0a142215ced61
Reviewed-on: https://chromium-review.googlesource.com/1036500
Commit-Queue: Jamie Madill <jmadill@chromium.org>
Commit-Queue: angle-chromium-autoroll <angle-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Reviewed-by: angle-chromium-autoroll <angle-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Cr-Commit-Position: refs/heads/master@{#555103}

[modify] https://crrev.com/a3ffbd46d8a7667f14738fdf42796f344e3ced64/DEPS
,
May 2 2018
,
May 9 2018
ClusterFuzz testcase 4614092258803712 is still reproducing on tip-of-tree build (trunk). Please re-test your fix against this testcase and if the fix was incorrect or incomplete, please re-open the bug. Otherwise, ignore this notification and add ClusterFuzz-Wrong label.
,
May 11 2018
Was not able to repro at ToT. Also checked that we don't need to free the string returned by glXQueryExtensionsString():
https://github.com/mesa3d/mesa/blob/f94597f554d284e8bedf0d00e3ad9e805306548f/src/glx/glxcmds.c#L1298
https://github.com/mesa3d/mesa/blob/6d6b1b3890a24048a702c3a032ccfa29644bd580/src/gallium/state_trackers/glx/xlib/glx_api.c#L1762
,
May 14 2018
The following revision refers to this bug:
https://chromium.googlesource.com/angle/angle/+/f345cdf37b81690c6942e64986d0d276531f38bd

commit f345cdf37b81690c6942e64986d0d276531f38bd
Author: Corentin Wallez <cwallez@chromium.org>
Date: Mon May 14 19:54:56 2018

DisplayGLX: Close the X display if we own it.

BUG= chromium:834269
Change-Id: Ia49f80f4c057ad467428a13e8cd4ca54ad48d5c4
Reviewed-on: https://chromium-review.googlesource.com/1058084
Reviewed-by: Jamie Madill <jmadill@chromium.org>
Reviewed-by: Geoff Lang <geofflang@chromium.org>
Commit-Queue: Corentin Wallez <cwallez@chromium.org>

[modify] https://crrev.com/f345cdf37b81690c6942e64986d0d276531f38bd/src/libANGLE/renderer/gl/glx/DisplayGLX.cpp
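The commit title describes an ownership rule: close the X connection on teardown only when the display object opened it itself. The diff is not quoted in this thread, so the following is purely an illustrative model of that rule. `FakeXlib` and `DisplayModel` are invented stand-ins and are not ANGLE or Xlib code.

```python
class FakeXlib:
    """Stand-in for Xlib that only counts connections still open."""

    def __init__(self):
        self.open_connections = 0

    def open_display(self):
        self.open_connections += 1
        return object()  # opaque handle, playing the role of a Display*

    def close_display(self, display):
        self.open_connections -= 1


class DisplayModel:
    """Hypothetical display wrapper that tracks whether it owns the connection."""

    def __init__(self, xlib, native_display=None):
        self._xlib = xlib
        # We own the connection only if the caller did not hand us one.
        self._owns_display = native_display is None
        self._display = (
            xlib.open_display() if self._owns_display else native_display
        )

    def terminate(self):
        # Close only what we opened; a caller-provided display stays open.
        if self._owns_display and self._display is not None:
            self._xlib.close_display(self._display)
        self._display = None
```

In this model, a harness that creates and terminates a display once per test ends each test with zero connections left open, while a display passed in by the caller is left alone. Without the close in `terminate`, the count would grow by one per test, which is the kind of per-iteration growth reported earlier in this thread.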
,
May 14 2018
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/9e7a5bd04dc7b4efc0874abc33986b392b2fe793

commit 9e7a5bd04dc7b4efc0874abc33986b392b2fe793
Author: angle-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com <angle-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Date: Mon May 14 23:45:14 2018

Roll src/third_party/angle/ 5d2ccc534..5730c0bf4 (9 commits)

https://chromium.googlesource.com/angle/angle.git/+log/5d2ccc534d26..5730c0bf431e

$ git log 5d2ccc534..5730c0bf4 --date=short --no-merges --format='%ad %ae %s'
2018-05-14 geofflang Add more dEQP EGL expectations for Linux and Android.
2018-05-14 cwallez DisplayGLX: Close the X display if we own it.
2018-05-14 geofflang Add more dEQP EGL expectations for Linux and Android.
2018-05-14 geofflang DEQP: Print not supported messages from tests.
2018-05-14 geofflang Add dEQP EGL test expectations for Linux and Android.
2018-05-10 jmadill Vulkan: Implement masked color clear with depth.
2018-05-14 jmadill Fix libGLESv2 wrong .def file.
2018-05-14 jmadill Fix warnings from size_t conversions.
2018-04-23 lfy GLES1: Renderer (minimal)

Created with: roll-dep src/third_party/angle

BUG= chromium:834269 , chromium:842028

The AutoRoll server is located here: https://angle-chromium-roll.skia.org

Documentation for the AutoRoller is here: https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, please contact the current sheriff, who should be CC'd on the roll, and stop the roller if necessary.

CQ_INCLUDE_TRYBOTS=luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel
TBR=ynovikov@chromium.org
Change-Id: I1ac4625bc1c520a30186f260160dffbdf5787693
Reviewed-on: https://chromium-review.googlesource.com/1058172
Commit-Queue: angle-chromium-autoroll <angle-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Reviewed-by: angle-chromium-autoroll <angle-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Cr-Commit-Position: refs/heads/master@{#558531}

[modify] https://crrev.com/9e7a5bd04dc7b4efc0874abc33986b392b2fe793/DEPS
Comment 1 by ynovikov@chromium.org, Apr 18 2018