New issue
Advanced search Search tips

Issue 697758 link

Starred by 15 users

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: All
Pri: 2
Type: Bug

Blocked on:
issue 849576

Blocking:
issue 704926



Sign in to add a comment

Profile and optimize shader translator

Project Member Reported by kbr@chromium.org, Mar 2 2017

Issue description

There has been vigorous discussion on the public_webgl mailing list at Khronos about the cost of shader compilation and program linking. The thread list so far:

https://www.khronos.org/webgl/public-mailing-list/archives/1702/threads.php#00039
https://www.khronos.org/webgl/public-mailing-list/archives/1703/threads.php#00002

In this recent message:

https://www.khronos.org/webgl/public-mailing-list/archives/1703/msg00002.php

Maksims Mihejevs from PlayCanvas provided a test case (attached) which loads all of the shaders from PlayCanvas' demo After the Flood ( https://playcanv.as/e/p/44MRmJRU/ ). This represents a real-world app.

Max posted some times from several platforms:

Windows; GTX 880M; ANGLE (DX11); i7-4810MQ; Chrome 58; ~5,300 ms
Windows; GTX 1070; ANGLE (DX11); i7-6820HK; Chrome 56; ~7,200 ms
Windows; GTX 980; ANGLE (DX11); i7-4790; Chrome 56; ~6,300 ms
Windows; GTX 980 Ti; ANGLE (DX11); i7 3770k; Chrome 56; ~5,100 ms
Windows; GTX 970; ANGLE (DX11); i7-3770; Firefox 52; ~7,600 ms
Windows; GTX 970; ANGLE (DX11); i7-3770; Chrome 58; ~5,500 ms
Windows; GTX 970; (GL); i7-3770; Chrome 58; ~3,200 ms
Windows; Intel HD 4000; ANGLE (DX11); Chrome 56; ~11,000 ms
Linux; GTX 970; i7-6700; Chrome 56; ~1,900 ms
Linux; GTX 670; i7-6700; Firefox 51; ~2,100 ms
Linux; GTX 980 Ti; i5 4690k; Chrome 56; ~2,200 ms
Mac Book 13" late 2013; Inter Iris; i7; ~900 ms
iMac 27" late 2012; GTX 660M; i5; Chrome 56; ~900 ms
iMac 27" late 2012; GTX 660M; i5; Firefox 51; ~930 ms
Android; One Plus Two; Adreno 430; Firefox 52; ~9,600 ms
Android; One Plus Two; Adreno 430; Chrome 58; ~11,600 ms
Android; One Plus Three; Adreno 530; Chrome 56; ~7,500 ms

Here is information from two machines I had readily available:
MacBook Pro Retina 15", Late 2013, NVIDIA GeForce GT 750M, Chrome 58.0.3025.3:
806.67ms - shaders compiled, linked and parameters got in 
617.75ms - shader compilation 
180.27ms - program link 
8.65ms - program attributes & uniforms 

Google Pixel, Adreno 530, Chrome 58.0.3019.3:
6089.51ms - shaders compiled, linked and parameters got in 
3303.56ms - shader compilation 
2725.89ms - program link 
60.06ms - program attributes & uniforms 

Many of these configurations yield interesting comparisons. On the whole, comparing configurations like Windows/GL and macOS makes it seem that significant time is being spent in the shader translator.

A couple of things are already known. First, the bulk of ANGLE's shader compilation time on Windows with the D3D11 backend is spent in Microsoft's HLSL compiler. (Thanks to the ANGLE team and I think jmadill@ in particular for doing this analysis.) This limits the potential speedup on that configuration without "esoteric" ideas. Second, a smartphone is about 1/10 the speed of a desktop machine. Therefore the speed comparison between the MBP Retina and Pixel phone isn't extraordinary.

The speed of light on these various platforms (i.e., how long does the system's GLSL compiler take to compile them, without any WebGL-related translation overhead) also isn't known.

Nonetheless: we should use this test case to profile the shader translator and see how it could be optimized. Likely the test case will need to be modified to run in a loop for that purpose. Creating a minimal native harness, bypassing the browser, to compile the same shaders (either on Android or on desktop OpenGL) should ideally also be done, to know the theoretical maximum speedup.

 
shader-cache.zip
85.8 KB Download
Owner: lethalantidote@chromium.org
Status: Assigned (was: Available)
to lethalantidote and to anyone profiling the shader translator, ANGLE has a native test suite that runs a performance test with a timing harness for benchmarking optimizations. Build angle_perftests, and run "angle/scripts/perf_test_runner.py LinkProgramBenchmark.Run/d3d11", or substitute "LinkProgramBenchmark.Run/gl" for the OpenGL back-end. The perf test runner will run the test in an infinite loop and print average time of each run, standard deviation, and truncated versions of the average and stddev (with the highest and lowest values clipped from the set of timings).

This currently runs on Windows and Linux. You can substitute in more complex shaders into angle/src/tests/perf_tests/LinkProgramPerfTest.cpp if you need.
(Note that I always use the perf test runner with standalone ANGLE, but you can probably get it to work with the Chromium build as well.)

Comment 4 by kbr@chromium.org, Mar 24 2017

Blocking: 704926
Status: Started (was: Assigned)
I was able to profile the shader translator with the given samples (each sample 10 times). The VS Analyzer pointed to two potential places for improvement, yyparse() (https://cs.chromium.org/chromium/src/third_party/angle/src/compiler/translator/glslang_tab.cpp?rcl=5978e28d3adc30d09c5cbdf51aa1e4b8dbfff6b1&l=2248) and the SymbolTable's TMap::find() (https://cs.chromium.org/chromium/src/third_party/angle/src/compiler/translator/SymbolTable.h?rcl=5978e28d3adc30d09c5cbdf51aa1e4b8dbfff6b1&l=235).

yyparse takes around 20% of the total translation time. However, it is a generated function produced by the parser generator. To make improvements would require to use a different/updated parser generator. This may not be feasible due to compatibility issues found previously. It is also unclear if the minor speedup would be worth it, considering the main bulk of the time spent in compilation happens in the HLSL compiler.

Time spent in TMap's find() function made up 8.28% of total compilation time. Most of those calls came from its use in the SymbolTable. By having the SymbolTable use an std::unsorted_map instead of a std::map, one could see about a 5% difference. I have attached the trials run with the parameters s=w2, b=h11. On average, a clean copy of the translator would take 1.7833 seconds to run through the test case. With an unordered_map SymbolTable, the average time was 1.6735. Although there was a difference, this difference may not be noticeable when included with the HLSL portion of compilation. 

Shader Translator Compilation Times Comparison.xlsx
3.7 KB Download
Thanks CJ for the analysis. Given the simplicity of making the change, I think we should definitely take the small savings even if it's dwarfed by D3DCompile time. It will help on other platforms like Android as well.
Project Member

Comment 7 by bugdroid1@chromium.org, Jun 6 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/angle/angle/+/10bed9fc0634ed2a1fa2061d4c9dfe9068e523af

commit 10bed9fc0634ed2a1fa2061d4c9dfe9068e523af
Author: Jamie Madill <jmadill@chromium.org>
Date: Tue Jun 06 17:20:17 2017

Minor optimizations to DynamicHLSL.

This makes us use std::ostringstream in more places, instead of
string concatenation.

BUG=chromium:697758

Change-Id: Ifdcaa2e7e119664fc9cfdc566ea13b519a294714
Reviewed-on: https://chromium-review.googlesource.com/521729
Commit-Queue: Jamie Madill <jmadill@chromium.org>
Reviewed-by: Geoff Lang <geofflang@chromium.org>
Reviewed-by: Corentin Wallez <cwallez@chromium.org>

[modify] https://crrev.com/10bed9fc0634ed2a1fa2061d4c9dfe9068e523af/src/libANGLE/renderer/d3d/DynamicHLSL.cpp
[modify] https://crrev.com/10bed9fc0634ed2a1fa2061d4c9dfe9068e523af/src/libANGLE/renderer/d3d/DynamicHLSL.h

Project Member

Comment 9 by bugdroid1@chromium.org, Jun 16 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/angle/angle/+/e72595b460d3b160bcc177f505a3c51a118e3c73

commit e72595b460d3b160bcc177f505a3c51a118e3c73
Author: Jamie Madill <jmadill@chromium.org>
Date: Fri Jun 16 15:13:44 2017

Rename EOpFaceForward to EOpFaceforward.

This mirrors the spec naming and makes auto-gen a little easier.

BUG=chromium:697758

Change-Id: I9bcbc2c874b9a93a6d542aedf2b239f01ee708ce
Reviewed-on: https://chromium-review.googlesource.com/526393
Reviewed-by: Olli Etuaho <oetuaho@nvidia.com>
Reviewed-by: Jamie Madill <jmadill@chromium.org>
Commit-Queue: Jamie Madill <jmadill@chromium.org>

[modify] https://crrev.com/e72595b460d3b160bcc177f505a3c51a118e3c73/src/compiler/translator/Intermediate.cpp
[modify] https://crrev.com/e72595b460d3b160bcc177f505a3c51a118e3c73/src/compiler/translator/BuiltInFunctionEmulatorHLSL.cpp
[modify] https://crrev.com/e72595b460d3b160bcc177f505a3c51a118e3c73/src/compiler/translator/IntermNode.cpp
[modify] https://crrev.com/e72595b460d3b160bcc177f505a3c51a118e3c73/src/compiler/translator/Operator.cpp
[modify] https://crrev.com/e72595b460d3b160bcc177f505a3c51a118e3c73/src/compiler/translator/OutputGLSLBase.cpp
[modify] https://crrev.com/e72595b460d3b160bcc177f505a3c51a118e3c73/src/compiler/translator/OutputHLSL.cpp
[modify] https://crrev.com/e72595b460d3b160bcc177f505a3c51a118e3c73/src/compiler/translator/Operator.h
[modify] https://crrev.com/e72595b460d3b160bcc177f505a3c51a118e3c73/src/compiler/translator/Initialize.cpp

Project Member

Comment 10 by bugdroid1@chromium.org, Jun 16 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/angle/angle/+/e72595b460d3b160bcc177f505a3c51a118e3c73

commit e72595b460d3b160bcc177f505a3c51a118e3c73
Author: Jamie Madill <jmadill@chromium.org>
Date: Fri Jun 16 15:13:44 2017

Rename EOpFaceForward to EOpFaceforward.

This mirrors the spec naming and makes auto-gen a little easier.

BUG=chromium:697758

Change-Id: I9bcbc2c874b9a93a6d542aedf2b239f01ee708ce
Reviewed-on: https://chromium-review.googlesource.com/526393
Reviewed-by: Olli Etuaho <oetuaho@nvidia.com>
Reviewed-by: Jamie Madill <jmadill@chromium.org>
Commit-Queue: Jamie Madill <jmadill@chromium.org>

[modify] https://crrev.com/e72595b460d3b160bcc177f505a3c51a118e3c73/src/compiler/translator/Intermediate.cpp
[modify] https://crrev.com/e72595b460d3b160bcc177f505a3c51a118e3c73/src/compiler/translator/BuiltInFunctionEmulatorHLSL.cpp
[modify] https://crrev.com/e72595b460d3b160bcc177f505a3c51a118e3c73/src/compiler/translator/IntermNode.cpp
[modify] https://crrev.com/e72595b460d3b160bcc177f505a3c51a118e3c73/src/compiler/translator/Operator.cpp
[modify] https://crrev.com/e72595b460d3b160bcc177f505a3c51a118e3c73/src/compiler/translator/OutputGLSLBase.cpp
[modify] https://crrev.com/e72595b460d3b160bcc177f505a3c51a118e3c73/src/compiler/translator/OutputHLSL.cpp
[modify] https://crrev.com/e72595b460d3b160bcc177f505a3c51a118e3c73/src/compiler/translator/Operator.h
[modify] https://crrev.com/e72595b460d3b160bcc177f505a3c51a118e3c73/src/compiler/translator/Initialize.cpp

Project Member

Comment 11 by bugdroid1@chromium.org, Jun 20 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/angle/angle/+/35bcad4295ac9db8a2a13dbf2bdf004d95ace79f

commit 35bcad4295ac9db8a2a13dbf2bdf004d95ace79f
Author: Jamie Madill <jmadill@chromium.org>
Date: Tue Jun 20 21:19:24 2017

Optimize builtin function emulator class.

This refactor uses a generator to produce static arrays instead of
using a bunch of std::map inserting statements. It speeds up shader
translation because every shader compile would create and tear down
this table.

Currently it is implemented as a flat array, but in the future we
could use compile-time hashing to implement faster lookup.

BUG=chromium:697758

Change-Id: I689f7de4d9b2c8c76095bb313f4c040116fc61d2
Reviewed-on: https://chromium-review.googlesource.com/521226
Commit-Queue: Jamie Madill <jmadill@chromium.org>
Reviewed-by: Corentin Wallez <cwallez@chromium.org>
Reviewed-by: Olli Etuaho <oetuaho@nvidia.com>

[add] https://crrev.com/35bcad4295ac9db8a2a13dbf2bdf004d95ace79f/src/compiler/translator/emulated_builtin_functions_hlsl_autogen.cpp
[modify] https://crrev.com/35bcad4295ac9db8a2a13dbf2bdf004d95ace79f/src/compiler.gypi
[modify] https://crrev.com/35bcad4295ac9db8a2a13dbf2bdf004d95ace79f/src/compiler/translator/BuiltInFunctionEmulator.cpp
[add] https://crrev.com/35bcad4295ac9db8a2a13dbf2bdf004d95ace79f/src/compiler/translator/emulated_builtin_function_data_hlsl.json
[add] https://crrev.com/35bcad4295ac9db8a2a13dbf2bdf004d95ace79f/src/compiler/translator/ParamType.h
[add] https://crrev.com/35bcad4295ac9db8a2a13dbf2bdf004d95ace79f/src/compiler/translator/gen_emulated_builtin_function_tables.py
[modify] https://crrev.com/35bcad4295ac9db8a2a13dbf2bdf004d95ace79f/src/compiler/translator/BuiltInFunctionEmulator.h
[modify] https://crrev.com/35bcad4295ac9db8a2a13dbf2bdf004d95ace79f/src/compiler/translator/BuiltInFunctionEmulatorHLSL.cpp

Project Member

Comment 12 by bugdroid1@chromium.org, Jun 22 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/d0db75d13114ecc183ef741853b4475fe8d132ac

commit d0db75d13114ecc183ef741853b4475fe8d132ac
Author: Yuly Novikov <ynovikov@chromium.org>
Date: Thu Jun 22 18:23:32 2017

Roll ANGLE 12b0b39..579d8c7

https://chromium.googlesource.com/angle/angle.git/+log/12b0b39..579d8c7

BUG=chromium:727671,728226,chromium:682815,731089,chromium:697758

TBR=jmadill@chromium.org

TEST=bots

CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.win:win_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.android:android_optional_gpu_tests_rel

Change-Id: Iccc5538b77641dce7ecfb9adb25930b9d916806b
Reviewed-on: https://chromium-review.googlesource.com/545056
Reviewed-by: Yuly Novikov <ynovikov@chromium.org>
Reviewed-by: Jamie Madill <jmadill@chromium.org>
Commit-Queue: Yuly Novikov <ynovikov@chromium.org>
Cr-Commit-Position: refs/heads/master@{#481610}
[modify] https://crrev.com/d0db75d13114ecc183ef741853b4475fe8d132ac/DEPS

Project Member

Comment 13 by bugdroid1@chromium.org, Jun 23 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/angle/angle/+/8033165b5360987c4899b3f755b9db3b287f3ad6

commit 8033165b5360987c4899b3f755b9db3b287f3ad6
Author: Jamie Madill <jmadill@chromium.org>
Date: Fri Jun 23 17:17:27 2017

Don't use constexpr pair constructor in translator.

Although this seems to compile and pass on our bots, the std::pair
constructor with arguments is not constexpr until c++14. Instead
use a helper struct which achieves the same goal.

BUG=chromium:697758

Change-Id: I0f9873729485a5059f79af969cb56f84706e6c98
Reviewed-on: https://chromium-review.googlesource.com/545796
Reviewed-by: Jamie Madill <jmadill@chromium.org>
Commit-Queue: Jamie Madill <jmadill@chromium.org>

[modify] https://crrev.com/8033165b5360987c4899b3f755b9db3b287f3ad6/src/compiler/translator/gen_emulated_builtin_function_tables.py
[modify] https://crrev.com/8033165b5360987c4899b3f755b9db3b287f3ad6/src/compiler/translator/emulated_builtin_functions_hlsl_autogen.cpp

 Issue angleproject:2089  has been merged into this issue.
Project Member

Comment 15 by bugdroid1@chromium.org, Jun 27 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/4b7e488dd71f5208a75a17b8c28af6b8a246a3d8

commit 4b7e488dd71f5208a75a17b8c28af6b8a246a3d8
Author: jmadill <jmadill@chromium.org>
Date: Tue Jun 27 16:10:37 2017

Roll ANGLE 579d8c7..0dc9781

https://chromium.googlesource.com/angle/angle.git/+log/579d8c7..0dc9781

BUG=chromium:697758

TBR=geofflang@chromium.org

TEST=bots

CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.win:win_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.android:android_optional_gpu_tests_rel

Review-Url: https://codereview.chromium.org/2955073003
Cr-Commit-Position: refs/heads/master@{#482640}

[modify] https://crrev.com/4b7e488dd71f5208a75a17b8c28af6b8a246a3d8/DEPS

I assume https://chromium-review.googlesource.com/c/526672/ is from this bug?
Crashing stack from angle_unittests on Nexus 6 is:

Stack Trace:
  RELADDR   FUNCTION                                                                                                                                                 FILE:LINE
  v------>  getblock32                                                                                                                                               /b/c/b/Android_Release__Nexus_6_/src/third_party/angle/src/common/third_party/murmurhash/MurmurHash3.cpp:57
  002356de  MurmurHash3_x86_32                                                                                                                                       /b/c/b/Android_Release__Nexus_6_/src/third_party/angle/src/common/third_party/murmurhash/MurmurHash3.cpp:112
  00221ddb  operator()                                                                                                                                               /b/c/b/Android_Release__Nexus_6_/src/third_party/angle/src/compiler/translator/Common.h:134
  v------>  operator()                                                                                                                                               /b/c/b/Android_Release__Nexus_6_/src/third_party/android_tools/ndk/sources/cxx-stl/llvm-libc++/libcxx/include/unordered_map:381
  00221c05  __construct_node<const std::__ndk1::pair<const std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, pool_allocator<char> >, sh::TSymbol *> >  /b/c/b/Android_Release__Nexus_6_/src/third_party/android_tools/ndk/sources/cxx-stl/llvm-libc++/libcxx/include/__hash_table:2086
  00221baf  __insert_unique<const std::__ndk1::pair<const std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, pool_allocator<char> >, sh::TSymbol *> >   /b/c/b/Android_Release__Nexus_6_/src/third_party/android_tools/ndk/sources/cxx-stl/llvm-libc++/libcxx/include/__hash_table:1868
  v------>  insert<const std::__ndk1::pair<const std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, pool_allocator<char> >, sh::TSymbol *>, void>       /b/c/b/Android_Release__Nexus_6_/src/third_party/android_tools/ndk/sources/cxx-stl/llvm-libc++/libcxx/include/unordered_map:907
  002205a3  insert                                                                                                                                                   /b/c/b/Android_Release__Nexus_6_/src/third_party/angle/src/compiler/translator/SymbolTable.cpp:115
  v------>  insert                                                                                                                                                   /b/c/b/Android_Release__Nexus_6_/src/third_party/angle/src/compiler/translator/SymbolTable.h:324
  002212f1  insertBuiltIn                                                                                                                                            /b/c/b/Android_Release__Nexus_6_/src/third_party/angle/src/compiler/translator/SymbolTable.cpp:429
  00220fa1  insertBuiltIn                                                                                                                                            /b/c/b/Android_Release__Nexus_6_/src/third_party/angle/src/compiler/translator/SymbolTable.cpp:382
  00221837  insertBuiltInOp                                                                                                                                          /b/c/b/Android_Release__Nexus_6_/src/third_party/angle/src/compiler/translator/SymbolTable.cpp:445
  002032dd  InsertBuiltInFunctions                                                                                                                                   /b/c/b/Android_Release__Nexus_6_/src/third_party/angle/src/compiler/translator/Initialize.cpp:47
  001fd6af  sh::TCompiler::InitBuiltInSymbolTable(ShBuiltInResources const&)                                                                                         /b/c/b/Android_Release__Nexus_6_/src/third_party/angle/src/compiler/translator/Compiler.cpp:612
  001fd5bf  sh::TCompiler::Init(ShBuiltInResources const&)                                                                                                           /b/c/b/Android_Release__Nexus_6_/src/third_party/angle/src/compiler/translator/Compiler.cpp:254
  000d4223  initTranslator                                                                                                                                           /b/c/b/Android_Release__Nexus_6_/src/third_party/angle/src/tests/compiler_tests/CollectVariables_test.cpp:41
  000d6b29  CollectVariablesTest::SetUp()                                                                                                                            /b/c/b/Android_Release__Nexus_6_/src/third_party/angle/src/tests/compiler_tests/CollectVariables_test.cpp:34
  001db63f  Run                                                                                                                                                      /b/c/b/Android_Release__Nexus_6_/src/third_party/googletest/src/googletest/src/gtest.cc:2467
  001dbae9  Run                                                                                                                                                      /b/c/b/Android_Release__Nexus_6_/src/third_party/googletest/src/googletest/src/gtest.cc:2653
  001dbd4f  Run                                                                                                                                                      /b/c/b/Android_Release__Nexus_6_/src/third_party/googletest/src/googletest/src/gtest.cc:2771
  001de943  RunAllTests                                                                                                                                              /b/c/b/Android_Release__Nexus_6_/src/third_party/googletest/src/googletest/src/gtest.cc:4648
  001de797  Run                                                                                                                                                      /b/c/b/Android_Release__Nexus_6_/src/third_party/googletest/src/googletest/src/gtest.cc:4256
  001e1ec3  Run                                                                                                                                                      /b/c/b/Android_Release__Nexus_6_/src/base/test/test_suite.cc:271
  00127f93  (anonymous namespace)::RunHelper(base::TestSuite*)                                                                                                       /b/c/b/Android_Release__Nexus_6_/src/gpu/angle_unittest_main.cc:19
  v------>  LaunchUnitTestsInternal                                                                                                                                  /b/c/b/Android_Release__Nexus_6_/src/base/test/launcher/unit_test_launcher.cc:192
  001e30b1  LaunchUnitTestsSerially                                                                                                                                  /b/c/b/Android_Release__Nexus_6_/src/base/test/launcher/unit_test_launcher.cc:470
  00127f59  main                                                                                                                                                     /b/c/b/Android_Release__Nexus_6_/src/gpu/angle_unittest_main.cc:29
  v------>  RunTests                                                                                                                                                 /b/c/b/Android_Release__Nexus_6_/src/testing/android/native_test/native_test_launcher.cc:131
  001e11b3  Java_org_chromium_native_1test_NativeTest_nativeRunTests                                                                                                 /b/c/b/Android_Release__Nexus_6_/src/out/Release/gen/testing/android/native_test/native_test_jni_headers/testing/jni/NativeTest_jni.h:49
  00169951  <unknown>                                                                                                                                                /data/dalvik-cache/arm/data@app@org.chromium.native_test-1@base.apk@classes.dex


lethalantidote@: I suspect the issue here is that MurmurHash3_x86_32 *only* works on x86. It probably makes assumptions about endianness, or something, that aren't true on ARM.

I guess we'll need another hash algorithm for ARM...? -_-
Looking at the comment:

// Block read - if your platform needs to do endian-swapping or can only
// handle aligned reads, do the conversion here

Yuly how did you get that stack trace? (Also, if you link to the crashing build as well, that can help). I dug up the build (https://build.chromium.org/p/chromium.gpu.fyi/builders/Android%20Release%20%28Nexus%206%29/builds/7865) but couldn't find the exception text. Maybe there's an indication if this is an unaligned read or something of the kind.
I've logged in to the trybot, ran angle_unittests and collected logcat, and then symbolised it.

Here is the log before the stack trace:
signal 7 (SIGBUS), code 1, fault addr 0xa0d11809 in tid 4801 (st:test_process)
pid: 4801, tid: 4801, name: st:test_process  >>> org.chromium.native_test:test_process <<<
signal 7 (SIGBUS), code 1 (BUS_ADRALN), fault addr 0xa0d11809
     r0 a0d11809  r1 1b873593  r2 00000000  r3 befb199c
     r4 00000001  r5 cc9e2d51  r6 16a88000  r7 00000007
     r8 00000007  r9 00000001  sl 00000000  fp 00000000
     ip a0d1180d  sp befb1980  lr befb199c  pc a262b6de

I guess BUS_ADRALN means it's an alignment issue?
Yup, exactly, thanks. Learn something new every day. Not sure what alignment it needs or why it has to be aligned, or how to fix that. Although it theoretically could be easy.
Right, so this might be a problem because the pool allocator doesn't do aligned allocations, and we're doing 32-bit reads from unaligned memory. Solutions are to

1) do 4 8-bit reads instead to read a block.
2) the same but do 32-bit reads when the block is aligned (you can check)
3) use a different hashing implementation.

1+2 don't sound too difficult, although a bit annoying compared to how easy we thought it would be.
Can we make the pool allocator honor alignment constraints instead? Or align everything to 8bytes?
Yeah, I thought that would be hard so I didn't list it, but maybe that's even easier.
@kainino: Looks like MurmerHash3.cpp is now included in the build twice on Linux (Ubuntu 14.04 64-bit without sysroot). I'm getting the following duplicate symbol error when building my Chromium-based application as Chromium master revision ff259bab (#488528):

FAILED: libcef.so libcef.so.TOC 
python "../../build/toolchain/gcc_solink_wrapper.py" --readelf="readelf" --nm="nm" --sofile="./libcef.so" --tocfile="./libcef.so.TOC" --output="./libcef.so"  -- ../../third_party/llvm-build/Release+Asserts/bin/clang++ -shared -Wl,--fatal-warnings -fPIC -Wl,-z,noexecstack -Wl,-z,now -Wl,-z,relro -fuse-ld=gold -B../../third_party/binutils/Linux_x64/Release/bin -Wl,--threads -Wl,--thread-count=4 -nodefaultlibs -m64 -Werror -Wl,-O1 -Wl,--gc-sections -fsanitize=address -fsanitize-address-use-after-scope -Wl,--export-dynamic -o "./libcef.so" -Wl,-soname="libcef.so" @"./libcef.so.rsp"
../../third_party/binutils/Linux_x64/Release/bin/ld.gold: error: obj/third_party/smhasher/libmurmurhash3.a(MurmurHash3.o): multiple definition of 'MurmurHash3_x86_32(void const*, int, unsigned int, void*)'
../../third_party/binutils/Linux_x64/Release/bin/ld.gold: obj/third_party/angle/libangle_common.a(MurmurHash3.o): previous definition here
../../third_party/binutils/Linux_x64/Release/bin/ld.gold: error: obj/third_party/smhasher/libmurmurhash3.a(MurmurHash3.o): multiple definition of 'MurmurHash3_x64_128(void const*, int, unsigned int, void*)'
../../third_party/binutils/Linux_x64/Release/bin/ld.gold: obj/third_party/angle/libangle_common.a(MurmurHash3.o): previous definition here
../../third_party/binutils/Linux_x64/Release/bin/ld.gold: error: obj/third_party/smhasher/libmurmurhash3.a(MurmurHash3.o): multiple definition of 'MurmurHash3_x86_128(void const*, int, unsigned int, void*)'
../../third_party/binutils/Linux_x64/Release/bin/ld.gold: obj/third_party/angle/libangle_common.a(MurmurHash3.o): previous definition here
clang: error: linker command failed with exit code 1 (use -v to see invocation)
ninja: build stopped: cannot make progress due to previous errors.


@comment#24: Partially reverting the commit from comment#8 fixes the build error for me. This is because Chromium's third_party/angle/BUILD.gn includes "libangle_common_sources" in the "angle_common" GN target which is a dependency of content/browser. On the other hand, the "libangle_sources" are included in the "libANGLE" GN target which is only a dependency of the libGLESv2 shared library.

diff --git src/libGLESv2.gypi src/libGLESv2.gypi
index e0167d7..937dda0 100644
--- src/libGLESv2.gypi
+++ src/libGLESv2.gypi
@@ -36,8 +36,6 @@
             'common/third_party/base/anglebase/sha1.cc',
             'common/third_party/base/anglebase/sha1.h',
             'common/third_party/base/anglebase/sys_byteorder.h',
-            'common/third_party/murmurhash/MurmurHash3.cpp',
-            'common/third_party/murmurhash/MurmurHash3.h',
             'common/tls.cpp',
             'common/tls.h',
             'common/utilities.cpp',
@@ -123,6 +121,8 @@
         [
             'common/event_tracer.cpp',
             'common/event_tracer.h',
+            'common/third_party/murmurhash/MurmurHash3.cpp',
+            'common/third_party/murmurhash/MurmurHash3.h',
             'libANGLE/AttributeMap.cpp',
             'libANGLE/AttributeMap.h',
             'libANGLE/BinaryStream.h',

Comment 26 by kbr@chromium.org, Jul 27 2017

Thanks marshall@ for tracking this down.

Can you propose a fix that doesn't involve reverting #8? Can we hide the symbols of ANGLE's copy of murmurhash3, or put them all in an ANGLE-scoped namespace?

Cc: phajdan.jr@chromium.org thomasanderson@chromium.org mmoss@chromium.org
@comment#26: I'm not sure what the best build-config solution would be on Linux. CC'ing some people who might know.
We should scope ANGLE's copy in a namespace I think. This should be very easy, and it would ensure we don't conflict with smhasher.
Alternately we could rename the entry points if ANGLE's version is a c (not cpp) file.
Owner: kainino@chromium.org
Status: Assigned (was: Started)
Sorry for the delay (due to SIGGRAPH). I will try to take care of this issue.

marshall: Since #8 was able to land without breaking the Chromium CQ, it doesn't seem to affect chromium builds in all configurations. Do you have instructions on how to reproduce this in a vanilla chromium build? (Does it require particular gn args or something?)

P.S. At some point I will try to also see if I can reland
https://chromium.googlesource.com/angle/angle/+/c14348a2589e5d995f63019433175545ba90040b
which was reverted in
https://chromium.googlesource.com/angle/angle/+/af01a064062d54540801789a8683d34f0c6a20bb
after doing one of the fixes in #21 or #22.
I read somewhere(...) that MurmurHash3_x86_32 _can_ work on ARM (I was wrong in #17). But it seems that the accesses will have to be aligned for it to work.
@kainino: The MurmurHash problem does not reproduce for me with a chrome target build. The chrome target routing is pretty complicated:

chrome_binary("chrome_initial") > group("browser_dependencies") > //chrome/browser > //content/public/browser > //content/browser > //third_party/angle:angle_common

chrome_binary("chrome_initial") > group("child_dependencies") > //chrome/renderer > //third_party/smhasher:murmurhash3

There are also a number of other targets that chrome depends on indirectly that also depend on //chrome/browser, //chrome/renderer or //content/public/browser. It's probably just luck that this results in a different linker ordering or intermediate static library that avoids the problem.
Okay, I see. It sounds like putting ANGLE's copy of murmurhash in a namespace should be sufficient even if I can't verify. It's high on my todo list, but anyone else is welcome to take it if they want.
Cc: -phajdan.jr@chromium.org -thomasanderson@chromium.org -mmoss@chromium.org
Status: Started (was: Assigned)
Removing folks added in #27, I think we're happy with the solution.

Opened a patch to move our copy of murmurhash to a namespace.
jmadill:
> Right, so this might be a problem because the pool allocator doesn't do aligned allocations, and we're doing 32-bit reads from unaligned memory.

Could you explain why this is true? From looking at src/compiler/translator/PoolAlloc.cpp, it seems like the default alignment (16) is actually used.
For that default alignment of 16, is that 16 bytes or 16 bits? If the latter, it would probably have to be 32-bit aligned. If not, I'm not sure.
TPoolAllocator::TPoolAllocator says the following:
    //
    // Adjust alignment to be at least pointer aligned and
    // power of 2.
    //

So things should be at least aligned on 32bits, or 64.
Okay, I guess the alignment wasn't actually the issue. I'll see if I can get a more useful crash trace.
Here's something:

signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0xca4117a9

Which is indeed unaligned. Maybe either the hash implementation does unaligned accesses, or ANGLE is actually producing unaligned allocations.
Yeah, pretty sure it's ANGLE producing unaligned allocations. Maybe try changing the 16 to 32 to test and see if it fixes it.
Project Member

Comment 40 by bugdroid1@chromium.org, Sep 19 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/angle/angle/+/8b2142e3734e5403ba247deea4bdda1cdde207db

commit 8b2142e3734e5403ba247deea4bdda1cdde207db
Author: Kai Ninomiya <kainino@chromium.org>
Date: Tue Sep 19 00:28:40 2017

Put MurmurHash3 functions in the angle namespace

To prevent collisions when linking with other copies of MurmurHash3.

BUG=697758

Change-Id: Id8a5c709ba972812ffa3ca143e7553cbf05fc57a
Reviewed-on: https://chromium-review.googlesource.com/671194
Commit-Queue: Kai Ninomiya <kainino@chromium.org>
Reviewed-by: Jamie Madill <jmadill@chromium.org>

[modify] https://crrev.com/8b2142e3734e5403ba247deea4bdda1cdde207db/src/common/third_party/murmurhash/MurmurHash3.h
[modify] https://crrev.com/8b2142e3734e5403ba247deea4bdda1cdde207db/src/common/third_party/murmurhash/MurmurHash3.cpp

Project Member

Comment 41 by bugdroid1@chromium.org, Sep 19 2017

The following revision refers to this bug:
  https://skia.googlesource.com/skia/+/8e258318e68f93998d6f9a8fdbba34684a3d9483

commit 8e258318e68f93998d6f9a8fdbba34684a3d9483
Author: angle-deps-roller@chromium.org <angle-deps-roller@chromium.org>
Date: Tue Sep 19 01:53:18 2017

Roll skia/third_party/externals/angle2/ 10ce2d284..8b2142e37 (1 commit)

https://chromium.googlesource.com/angle/angle.git/+log/10ce2d284432..8b2142e3734e

$ git log 10ce2d284..8b2142e37 --date=short --no-merges --format='%ad %ae %s'
2017-09-18 kainino Put MurmurHash3 functions in the angle namespace

Created with:
  roll-dep skia/third_party/externals/angle2
BUG=697758


Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, see:
http://www.chromium.org/developers/tree-sheriffs/sheriff-details-chromium#TOC-Failures-due-to-DEPS-rolls


CQ_INCLUDE_TRYBOTS=skia.primary:Perf-Win10-MSVC-AlphaR2-GPU-RadeonR9M470X-x86_64-Debug-ANGLE,Perf-Win10-MSVC-Golo-GPU-QuadroP400-x86_64-Debug-ANGLE,Perf-Win10-MSVC-NUC5i7RYH-GPU-IntelIris6100-x86_64-Debug-ANGLE,Perf-Win10-MSVC-NUC6i5SYK-GPU-IntelIris540-x86_64-Debug-ANGLE,Perf-Win10-MSVC-NUCD34010WYKH-GPU-IntelHD4400-x86_64-Debug-ANGLE,Perf-Win10-MSVC-ShuttleC-GPU-GTX960-x86_64-Debug-ANGLE,Test-Win10-MSVC-AlphaR2-GPU-RadeonR9M470X-x86_64-Debug-ANGLE,Test-Win10-MSVC-Golo-GPU-QuadroP400-x86_64-Debug-ANGLE,Test-Win10-MSVC-NUC5i7RYH-GPU-IntelIris6100-x86_64-Debug-ANGLE,Test-Win10-MSVC-NUC6i5SYK-GPU-IntelIris540-x86_64-Debug-ANGLE,Test-Win10-MSVC-NUCD34010WYKH-GPU-IntelHD4400-x86_64-Debug-ANGLE,Test-Win10-MSVC-ShuttleC-GPU-GTX960-x86_64-Debug-ANGLE
TBR=fmalita@google.com

Change-Id: I0b556d366f8d3a6168581843093d69b40786cdfe
Reviewed-on: https://skia-review.googlesource.com/48281
Reviewed-by: angle-deps-roller . <angle-deps-roller@chromium.org>
Commit-Queue: angle-deps-roller . <angle-deps-roller@chromium.org>

[modify] https://crrev.com/8e258318e68f93998d6f9a8fdbba34684a3d9483/DEPS

Project Member

Comment 42 by bugdroid1@chromium.org, Sep 19 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/add4f677d287530e01e62f3b8ec7967e4b063ca6

commit add4f677d287530e01e62f3b8ec7967e4b063ca6
Author: skia-deps-roller@chromium.org <skia-deps-roller@chromium.org>
Date: Tue Sep 19 03:12:11 2017

Roll src/third_party/skia/ c86d0794e..8e258318e (1 commit)

https://skia.googlesource.com/skia.git/+log/c86d0794ed37..8e258318e68f

$ git log c86d0794e..8e258318e --date=short --no-merges --format='%ad %ae %s'
2017-09-19 angle-deps-roller Roll skia/third_party/externals/angle2/ 10ce2d284..8b2142e37 (1 commit)

Created with:
  roll-dep src/third_party/skia
BUG=697758


Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, see:
http://www.chromium.org/developers/tree-sheriffs/sheriff-details-chromium#TOC-Failures-due-to-DEPS-rolls


CQ_INCLUDE_TRYBOTS=master.tryserver.blink:linux_trusty_blink_rel;master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel;master.tryserver.chromium.android:android_optional_gpu_tests_rel
TBR=fmalita@chromium.org

Change-Id: I23c180e9f9415729b04e82b733299c81f4ada579
Reviewed-on: https://chromium-review.googlesource.com/672143
Reviewed-by: Skia Deps Roller <skia-deps-roller@chromium.org>
Commit-Queue: Skia Deps Roller <skia-deps-roller@chromium.org>
Cr-Commit-Position: refs/heads/master@{#502771}
[modify] https://crrev.com/add4f677d287530e01e62f3b8ec7967e4b063ca6/DEPS

Project Member

Comment 43 by bugdroid1@chromium.org, Sep 19 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/f4a85e0ddd183a7bf7cdd0290875419085556fed

commit f4a85e0ddd183a7bf7cdd0290875419085556fed
Author: Geoff Lang <geofflang@chromium.org>
Date: Tue Sep 19 19:40:55 2017

Roll ANGLE 371ed53..abd3135

https://chromium.googlesource.com/angle/angle.git/+log/371ed53..abd3135

BUG=,None,chromium:507755,697758,chromium:759402,chromium:765363

TBR=jmadill@chromium.org

TEST=bots

CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.win:win_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.android:android_optional_gpu_tests_rel

Change-Id: Id871246203c86489e25aac00149b1590cdb5f3e0
Reviewed-on: https://chromium-review.googlesource.com/673403
Reviewed-by: Geoff Lang <geofflang@chromium.org>
Commit-Queue: Geoff Lang <geofflang@chromium.org>
Cr-Commit-Position: refs/heads/master@{#502921}
[modify] https://crrev.com/f4a85e0ddd183a7bf7cdd0290875419085556fed/DEPS

Project Member

Comment 44 by bugdroid1@chromium.org, Sep 29 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/angle/angle/+/8c5b69cbaa1084d42d2fd68e2781a78bd6dc6015

commit 8c5b69cbaa1084d42d2fd68e2781a78bd6dc6015
Author: Kai Ninomiya <kainino@chromium.org>
Date: Fri Sep 29 19:43:31 2017

Replace MurmurHash3 with PMurHash

PMurHash comes from the smhasher repository at chromium/src/third_party/smhasher

Bug: 697758
Change-Id: Id2859edf37ae66bf27509d53db7f22db8831fe44
Reviewed-on: https://chromium-review.googlesource.com/687970
Commit-Queue: Kai Ninomiya <kainino@chromium.org>
Reviewed-by: Jamie Madill <jmadill@chromium.org>

[add] https://crrev.com/8c5b69cbaa1084d42d2fd68e2781a78bd6dc6015/src/common/third_party/smhasher/src/PMurHash.cpp
[modify] https://crrev.com/8c5b69cbaa1084d42d2fd68e2781a78bd6dc6015/src/libANGLE/renderer/d3d/d3d11/InputLayoutCache.cpp
[modify] https://crrev.com/8c5b69cbaa1084d42d2fd68e2781a78bd6dc6015/src/libGLESv2.gypi
[delete] https://crrev.com/2a9e107c072e241a610a4496daac9590703e2fa0/src/common/third_party/murmurhash/MurmurHash3.cpp
[modify] https://crrev.com/8c5b69cbaa1084d42d2fd68e2781a78bd6dc6015/src/libANGLE/renderer/d3d/d3d11/RenderStateCache.cpp
[add] https://crrev.com/8c5b69cbaa1084d42d2fd68e2781a78bd6dc6015/src/common/third_party/smhasher/README.angle
[add] https://crrev.com/8c5b69cbaa1084d42d2fd68e2781a78bd6dc6015/src/common/third_party/smhasher/LICENSE
[delete] https://crrev.com/2a9e107c072e241a610a4496daac9590703e2fa0/src/common/third_party/murmurhash/LICENSE
[modify] https://crrev.com/8c5b69cbaa1084d42d2fd68e2781a78bd6dc6015/src/libANGLE/SizedMRUCache.h
[delete] https://crrev.com/2a9e107c072e241a610a4496daac9590703e2fa0/src/common/third_party/murmurhash/MurmurHash3.h
[add] https://crrev.com/8c5b69cbaa1084d42d2fd68e2781a78bd6dc6015/src/common/third_party/smhasher/src/PMurHash.h

Project Member

Comment 45 by bugdroid1@chromium.org, Oct 24 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/angle/angle/+/d4556dfc355a06b73d3ad64f2fea03f5aa0b1693

commit d4556dfc355a06b73d3ad64f2fea03f5aa0b1693
Author: Kai Ninomiya <kainino@chromium.org>
Date: Tue Oct 24 21:38:51 2017

Reland 'Adds TUnorderedMap and uses it for tLevel in TSymbolTableLevel.'

Reland of https://crrev.com/c/526672

Bug: 697758
Change-Id: I410e4774c4ad85595eb8789603901878b209c857
Reviewed-on: https://chromium-review.googlesource.com/688296
Reviewed-by: Jamie Madill <jmadill@chromium.org>
Commit-Queue: Kai Ninomiya <kainino@chromium.org>

[modify] https://crrev.com/d4556dfc355a06b73d3ad64f2fea03f5aa0b1693/src/compiler/translator/SymbolTable.h
[modify] https://crrev.com/d4556dfc355a06b73d3ad64f2fea03f5aa0b1693/src/compiler/translator/Common.h

Owner: ----
Status: Available (was: Started)
Forgot to unassign this when I finished relanding all of those changes. This work is available again.

Comment 47 by kbr@chromium.org, Jun 7 2018

Blockedon: 849576

Sign in to add a comment