
Issue 787242 link

Starred by 2 users

Issue metadata

Status: Verified
Owner:
OOO until 2019-01-24
Closed: Dec 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug

Blocked on:
issue 786603
issue 791331




win_optional_gpu_tests_rel compile failure

Project Member Reported by sunn...@chromium.org, Nov 21 2017

Issue description

From https://logs.chromium.org/v/?s=chromium%2Fbb%2Ftryserver.chromium.win%2Fwin_optional_gpu_tests_rel%2F17458%2F%2B%2Frecipes%2Fsteps%2Fcompile__with_patch_%2F0%2Fstdout

[1584/9145] LINK(DLL) win_clang_nacl_win64/libGLESv2.dll win_clang_nacl_win64/libGLESv2.dll.lib win_clang_nacl_win64/libGLESv2.dll.pdb
FAILED: win_clang_nacl_win64/libGLESv2.dll win_clang_nacl_win64/libGLESv2.dll.lib win_clang_nacl_win64/libGLESv2.dll.pdb 
E:/b/depot_tools/win_tools-2_7_6_bin/python/bin/python.exe ../../build/toolchain/win/tool_wrapper.py link-wrapper environment.x64 False link.exe /nologo /IMPLIB:win_clang_nacl_win64/libGLESv2.dll.lib /DLL /OUT:win_clang_nacl_win64/libGLESv2.dll /PDB:win_clang_nacl_win64/libGLESv2.dll.pdb @win_clang_nacl_win64/libGLESv2.dll.rsp
libANGLE.lib(renderer_utils.obj) : error LNK2019: unresolved external symbol "public: class gl::Error __cdecl gl::Texture::setStorageMultisample(class gl::Context const *,unsigned int,int,int,struct gl::Extents const &,unsigned char)" (?setStorageMultisample@Texture@gl@@QEAA?AVError@2@PEBVContext@2@IHHAEBUExtents@2@E@Z) referenced in function "public: class gl::Error __cdecl rx::IncompleteTextureSet::getIncompleteTexture(class gl::Context const *,unsigned int,class rx::MultisampleTextureInitializer *,class gl::Texture * *)" (?getIncompleteTexture@IncompleteTextureSet@rx@@QEAA?AVError@gl@@PEBVContext@4@IPEAVMultisampleTextureInitializer@2@PEAPEAVTexture@4@@Z)
win_clang_nacl_win64/libGLESv2.dll : fatal error LNK1120: 1 unresolved externals

There are similar failures in the last couple of dozen builds:

https://build.chromium.org/p/tryserver.chromium.win/builders/win_optional_gpu_tests_rel?numbuilds=200
 
I think something's up with the compile dependencies of the optional builders. There's no equivalent compile failure on the gpu.fyi bots, and the changes passed the CQ to land. This seems similar to issue 786603. We might have to dig into it.

Comment 2 by kbr@chromium.org, Nov 21 2017

Blockedon: 786603
Cc: h...@chromium.org dpranke@chromium.org geoffl...@chromium.org thakis@chromium.org jmad...@chromium.org
Components: Infra>Client>Chrome Infra>Goma Build
The main difference between this trybot:
https://ci.chromium.org/buildbot/tryserver.chromium.win/win_optional_gpu_tests_rel/?limit=200

and the combination of this builder and tester:
https://ci.chromium.org/buildbot/chromium.gpu.fyi/GPU%20Win%20Builder/?limit=200
https://ci.chromium.org/buildbot/chromium.gpu.fyi/Win7%20Release%20%28NVIDIA%29/?limit=200

and the ANGLE tryservers like this one:
https://ci.chromium.org/buildbot/tryserver.chromium.angle/win_angle_rel_ng/?limit=200

is that win_optional_gpu_tests_rel shares its builder pool with a lot of other trybot configurations.

1) They all build with top-of-tree ANGLE
2) Presumably, they're all repeatedly checking out ANGLE at different revisions (both backward and forward in time) all the time
3) They all do the statically-linked build, not component build

I looked at a few of the compile failures:
https://ci.chromium.org/buildbot/tryserver.chromium.win/win_optional_gpu_tests_rel/17456
https://ci.chromium.org/buildbot/tryserver.chromium.win/win_optional_gpu_tests_rel/17454
https://ci.chromium.org/buildbot/tryserver.chromium.win/win_optional_gpu_tests_rel/17453

and the associated buildslaves that compiled them (see a few attached screenshots):

https://build.chromium.org/p/tryserver.chromium.win/buildslaves/vm162-m4
https://build.chromium.org/p/tryserver.chromium.win/buildslaves/vm183-m4
https://build.chromium.org/p/tryserver.chromium.win/buildslaves/vm333-m4
https://build.chromium.org/p/tryserver.chromium.win/buildslaves/vm402-m4

and found that the failing builds are all preceded by tryjobs from the win7_chromium_rel_loc_exp trybot:
https://ci.chromium.org/buildbot/tryserver.chromium.win/win7_chromium_rel_loc_exp/

Looking at that trybot's definition in trybots.py and CrWinGoma(loc) in mb_config.pyl, it's doing the shared library build, while the optional tryservers do the statically linked build.

Could it be that the switch between these two configurations is not being handled correctly? It essentially needs to force a full rebuild. This is the main difference between win_optional_gpu_tests_rel and the ANGLE tryservers, which have their own pool of builders.
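As a rough illustration of what "force a full rebuild" could look like, the sketch below records a fingerprint of the GN args in the output directory and wipes the directory when the fingerprint changes. This is not how the Chromium recipes work; the stamp file name and both function names are hypothetical, and the GN args are passed as a plain dict purely for illustration.

```python
import hashlib
import shutil
from pathlib import Path

# Hypothetical stamp file name; nothing in the Chromium build uses this.
STAMP_NAME = ".gn_args_stamp"


def args_fingerprint(gn_args: dict) -> str:
    """Hash the GN args in a canonical (sorted) order."""
    canonical = "\n".join(f"{key}={gn_args[key]}" for key in sorted(gn_args))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def clobber_if_config_changed(out_dir: Path, gn_args: dict) -> bool:
    """Wipe out_dir if it was last used with different GN args.

    An existing directory without a stamp is treated as unknown state
    and is wiped as well. Returns True if the directory was wiped.
    """
    stamp = out_dir / STAMP_NAME
    wanted = args_fingerprint(gn_args)
    previous = stamp.read_text() if stamp.exists() else None

    clobbered = False
    if out_dir.exists() and previous != wanted:
        shutil.rmtree(out_dir)
        clobbered = True

    out_dir.mkdir(parents=True, exist_ok=True)
    stamp.write_text(wanted)
    return clobbered
```

Calling `clobber_if_config_changed(Path("out/Release"), {"is_component_build": "false"})` before each compile would trade build time for isolation between the shared-library and statically linked configurations.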

Could I ask for some help on this?

(Per tools/build/masters/master.tryserver.chromium.win/slaves.cfg, the cq_slaves pool is quite large, so carving off win_optional_gpu_tests_rel into its own pool for reliability doesn't seem workable. Also, builders are supposed to be able to build any configuration at any time, so it seems we need to solve the root cause of this problem.)

Attachments:
Screen Shot 2017-11-20 at 9.29.08 PM.png (236 KB)
Screen Shot 2017-11-20 at 9.31.47 PM.png (317 KB)
Screen Shot 2017-11-20 at 9.32.12 PM.png (250 KB)

Comment 3 by kbr@chromium.org, Nov 21 2017

Cc: machenb...@chromium.org
+machenbach in case this is affecting V8 rolls

There was an issue open where someone suggested using the name of the bot instead of the Debug/Release folders. I just wanted to note that in this issue; I can look up the other issue tomorrow.
Looks like it was my CL that caused this particular issue (https://chromium-review.googlesource.com/c/angle/angle/+/779799).  I can revert it if needed.
I can't find the issue I was thinking of in comment #4. But I believe dpranke and others were cc'ed on that issue. Moving to a separate build directory per builder, or perhaps somehow better separating compile flags into categories, might make sense.

Geoff, I think an immediate fix would be to clean the output dirs on the builders, but I don't think you need to revert your CL. It's not the CL's fault.
So, a few different things here ...

1) We need to stop using CQ_INCLUDE_TRYBOTS in PRESUBMITs to add things; people in Ops who are managing capacity have no visibility into this. We should either add these bots unconditionally and let analyze do the work, or modify the JSON formats to specify filters.

2) It seems really wrong to me that we'd have patches intended to be committed into Chrome that would be tested against tip-of-tree ANGLE rather than the version specified in DEPS. That means try jobs would be rejected by configurations that non-angle devs don't see.

3) Generally speaking, I want us to move to a world where we have pool-per-builder, rather than the big shared windows pools. So, if we need to carve out a pool for win_optional_gpu_tests_rel we should just do that.

4) Failing that, we can configure the builders to have separate checkout directories so that they don't stomp on each other, but that takes up extra disk space.

5) Apart from *all* of that, theoretically the build should be able to rebuild with arbitrary different GN args and source file changes into the same directory (at the cost of a lot of clobbering), but I wouldn't be surprised if we hit bugs from time to time.
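For option 4, and the earlier suggestion of a separate build directory per builder, a minimal sketch of deriving the directory from the builder name might look like this. The function name and the `out/<builder>/<build type>` layout are assumptions for illustration, not an existing convention in the recipes.

```python
import re
from pathlib import PurePosixPath


def out_dir_for_builder(builder_name: str, build_type: str = "Release") -> PurePosixPath:
    """Map a builder name to its own output directory (hypothetical layout)."""
    # Replace anything that is awkward in a path component, such as spaces.
    safe_name = re.sub(r"[^A-Za-z0-9_.-]", "_", builder_name)
    return PurePosixPath("out") / safe_name / build_type
```

With this layout, win_optional_gpu_tests_rel and win7_chromium_rel_loc_exp would never share object files, at the cost of the extra disk space mentioned above.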

So, all that said, which parts of this do you want help with?

Comment 8 by kbr@chromium.org, Nov 22 2017

Status: Started (was: Assigned)
Dirk, thanks for your comments. I'll be happy to work with you on (1) and (2); there are valid reasons for both, but they can be rethought.

At the immediate moment I appreciate your willingness to do (3), and have uploaded https://chromium-review.googlesource.com/784317 splitting win_optional_gpu_tests_rel into its own pool. If we can do that now I'll gladly follow up on the other issues.

Regarding #2, I agree that the optional builders should not use ToT ANGLE. I think maybe the original reason was for better testing of WebGL rolls? In any case ANGLE ToT should not be that far behind DEPS ANGLE any more with the auto-roller working well now.

Comment 10 by kbr@chromium.org, Nov 22 2017

It is not easy to make the optional builders avoid using top-of-tree ANGLE. We don't have any bots that run the WebGL 2.0 conformance tests with the DEPS version of ANGLE, and it wouldn't be responsible to have a tryserver which doesn't exactly mirror a waterfall bot. I'm happy to discuss this more in depth with anyone, but we have real machine constraints and I don't want to add a spurious configuration that won't add much value.

I guess we could try to diagnose the build dependency error by doing the same thing the builders do: build Chrome one way, change the config, and build again in the same folder.
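A local reproduction along those lines could be scripted roughly as follows. This is only a sketch: the output directory and the `libGLESv2` ninja target name are assumptions, and the GN args are a simplification (the real bots take theirs from mb_config.pyl). `run` is a dry run by default and only prints the commands.

```python
import subprocess

OUT_DIR = "out/Repro"  # hypothetical output directory
TARGET = "libGLESv2"   # assumed ninja target name


def repro_commands(out_dir: str = OUT_DIR, target: str = TARGET) -> list:
    """Build once as a component build, then statically, in the same directory."""
    commands = []
    for component in ("true", "false"):
        gn_args = f"is_component_build={component} is_debug=false"
        commands.append(["gn", "gen", out_dir, f"--args={gn_args}"])
        commands.append(["ninja", "-C", out_dir, target])
    return commands


def run(commands: list, dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)
```

Running `run(repro_commands(), dry_run=False)` from a Chromium checkout would perform both builds; if the second link fails with an unresolved external like the one in the description, the stale-dependency theory is confirmed.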

Comment 12 by bugdroid1@chromium.org, Nov 22 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/tools/build/+/10d6e6fd9350083073fc9247f1d621bc02376f34

commit 10d6e6fd9350083073fc9247f1d621bc02376f34
Author: Kenneth Russell <kbr@chromium.org>
Date: Wed Nov 22 22:50:38 2017

Split win_optional_gpu_tests_rel into its own pool.

There appear to be bugs in the build dependencies when switching
between component and statically-linked builds, which seemed to be
happening because win7_chromium_rel_loc_exp was sharing these
slaves. Work around the problem by having dedicated slaves for the
win_optional_gpu_tests_rel tryserver.

BUG= 787242 

Change-Id: Iff9b96408eb8792a7b1096f0a3b2e781a701555f
Reviewed-on: https://chromium-review.googlesource.com/784317
Reviewed-by: Dirk Pranke <dpranke@chromium.org>
Commit-Queue: Kenneth Russell <kbr@chromium.org>

[modify] https://crrev.com/10d6e6fd9350083073fc9247f1d621bc02376f34/masters/master.tryserver.chromium.win/slaves.cfg


Comment 13 by bugdroid1@chromium.org, Nov 23 2017

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infradata/master-manager/+/e54e85b4910cff0d1799dd3754f0d32d2fe77b44

commit e54e85b4910cff0d1799dd3754f0d32d2fe77b44
Author: Kenneth Russell <kbr@google.com>
Date: Thu Nov 23 00:03:05 2017

Following up on #c7 - #c10, we'll leave things the way they are for the moment. 

I would like to move this sort of logic out of the PRESUBMIT checks and into the //testing/buildbot files, so that it can be in one (more obvious) place, though I would prefer to do that only after we've gotten rid of the JSON and replaced it with something that doesn't have so much duplication.

Comment 15 by kbr@chromium.org, Dec 3 2017

Blockedon: 791331

Comment 16 by kbr@chromium.org, Dec 3 2017

Thanks for your feedback Dirk.

I would have closed this as Fixed but one of the VMs in the new pool seems to be failing compiles all the time. Until that's resolved this problem still basically exists.

> one of the VMs in the new pool seems to be failing compiles all the time.

Which VM, and has it been taken out of rotation and fixed yet, or do you need someone to do that?

Comment 18 by kbr@chromium.org, Dec 4 2017

It was handled in Issue 791331. The re-imaged VM hasn't run a job yet so I'm waiting for it to do so before closing this.

Comment 19 by kbr@chromium.org, Dec 5 2017

Status: Verified (was: Started)
The VM has had a few green builds:
https://build.chromium.org/deprecated/tryserver.chromium.win/buildslaves/vm497-m4

Closing.
