compare_build_artifacts failing on chromium.linux/Deterministic Linux (dbg) due to remoting-webapp.v2.zip not being deterministic
Issue description
Filed by sheriff-o-matic@appspot.gserviceaccount.com on behalf of zmin@google.com

compare_build_artifacts failing on chromium.linux/Deterministic Linux (dbg)

Builders failed on:
- Deterministic Linux (dbg): https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Deterministic%20Linux%20%28dbg%29

unexpected diffs: 2
Unexpected files with diffs:
browser_tests.isolated
remoting-webapp.v2.zip
,
Oct 31
Issue 900695 has been merged into this issue.
,
Oct 31
Hi Nico, Could you please take a look at this? Thanks.
,
Oct 31
I'm away for the next 2 weeks. A pnacl person should look at this. My guess is that this fails only flakily and that the next build will be fine.
,
Oct 31
Thanks. This is not an emergency, as it has happened in only about 4 of 20 builds.
,
Nov 5
dschuff@ are you someone who can take a look or retriage? (Based on #4 saying a pnacl person should take a look.) Raising the priority as long as this is not disabled or fixed.
,
Nov 5
Is there any documentation on how to reproduce this? The PNaCl compiler has not changed in many months, and due to the ongoing change from Buildbot to LUCI, the toolchain release builder is probably broken right now. So even assuming we find a toolchain determinism bug that we can fix, I wouldn't expect an update in the timeframe you'd want for a P1 bug. Do you have a sense for when this started happening? I.e., was it really in the last 4 days, or is this from before?
,
Nov 5
Considering the occasional failures of the builder, the non-determinism seems to happen randomly. But it happened with the following args.gn:
'''
is_component_build = true
is_debug = true
strip_absolute_paths_from_debug_symbols = true
symbol_level = 1
use_goma = true
goma_dir = "/b/swarming/w/ir/cache/goma/client" # modify this.
'''
Anyway, the build failures come from the diff in remoting-webapp.v2.zip, so we need to find out why remoting-webapp.v2.zip differs between builds.
https://logs.chromium.org/logs/chromium/buildbucket/cr-buildbucket.appspot.com/8930852789031777952/+/steps/compare_build_artifacts/0/logs/json.output/0
https://logs.chromium.org/logs/chromium/buildbucket/cr-buildbucket.appspot.com/8931123812272548272/+/steps/compare_build_artifacts/0/logs/json.output/0
https://logs.chromium.org/logs/chromium/buildbucket/cr-buildbucket.appspot.com/8931130030577484160/+/steps/compare_build_artifacts/0/logs/json.output/0
https://logs.chromium.org/logs/chromium/buildbucket/cr-buildbucket.appspot.com/8931268871280130624/+/steps/compare_build_artifacts/0/logs/json.output/0
https://logs.chromium.org/logs/chromium/buildbucket/cr-buildbucket.appspot.com/8931280146868347664/+/steps/compare_build_artifacts/0/logs/json.output/0
To keep the builder green, let me add remoting-webapp.v2.zip to the blacklist as a workaround.
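For anyone trying to reproduce this locally, one rough approach is to build twice into two output directories with the same args.gn and list the files whose contents differ. The sketch below is only an illustration of that idea, not the builder's actual comparison step; the function names and the two directory arguments are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import List


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def differing_files(first_dir: str, second_dir: str) -> List[str]:
    """List relative paths that differ in content or exist on one side only."""
    first, second = Path(first_dir), Path(second_dir)
    names = set()
    for root in (first, second):
        names.update(
            str(p.relative_to(root)) for p in root.rglob("*") if p.is_file()
        )
    diffs = []
    for name in sorted(names):
        a, b = first / name, second / name
        # A file missing on either side counts as a difference.
        if not (a.is_file() and b.is_file()) or file_digest(a) != file_digest(b):
            diffs.append(name)
    return diffs
```

Since the failure rate reported above is roughly 4 in 20, a single pair of builds may well come out identical; several attempts would likely be needed.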
,
Nov 5
Looking at this again, I think the (p)nacl stuff is a red herring. remoting-webapp.v2.zip is different for some not yet understood reason, and then the script compares inputs of that file and that finds the pnacl obj files – but while those _are_ different (and should be fixed), they don't cause the difference in the final binaries that we send to swarming.
,
Nov 5
Here's a link to a failing build: https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Deterministic%20Linux%20%28dbg%29/3228
,
Nov 5
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/e315a84880804cf22f7773eac72b98ca26add094

commit e315a84880804cf22f7773eac72b98ca26add094
Author: Takuto Ikuta <tikuta@chromium.org>
Date: Mon Nov 05 21:46:38 2018

Whitelist browser_tests.isolated and remoting-webapp.v2.zip for Linux

Whitelist the files only for component_build.

Bug: 900696
Change-Id: Ie85e9d963b6d28f8187abc2d7223d78c3851bc89
Reviewed-on: https://chromium-review.googlesource.com/c/1318510
Commit-Queue: Takuto Ikuta <tikuta@chromium.org>
Reviewed-by: Nico Weber <thakis@chromium.org>
Cr-Commit-Position: refs/heads/master@{#605478}

[modify] https://crrev.com/e315a84880804cf22f7773eac72b98ca26add094/tools/determinism/deterministic_build_whitelist.pyl
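As a rough illustration of what a per-configuration whitelist does to the comparison result, the sketch below filters a list of differing files against a set of known offenders. The dictionary layout, the configuration key, and the function name are all assumptions made for this example; they are not the actual format of deterministic_build_whitelist.pyl.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical layout: build configuration -> files allowed to differ.
KNOWN_NONDETERMINISTIC: Dict[str, Set[str]] = {
    "linux_component": {"browser_tests.isolated", "remoting-webapp.v2.zip"},
}


def unexpected_diffs(
    diffs: Iterable[str],
    config: str,
    allowlist: Dict[str, Set[str]] = KNOWN_NONDETERMINISTIC,
) -> List[str]:
    """Return the differing files that are not whitelisted for this config."""
    allowed = allowlist.get(config, set())
    return sorted(d for d in diffs if d not in allowed)
```

Keying the list by configuration mirrors the commit's intent of whitelisting the files only for component builds, so the same diffs would still fail any other configuration.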
,
Nov 6
With the whitelist added, I'll remove the sheriff label and let you guys sort out the priority :) thanks for the quick response!
,
Nov 27
agrieve made compare_build_artifacts.py print more information for zip file diffs, and that prints (https://logs.chromium.org/logs/chromium/buildbucket/cr-buildbucket.appspot.com/8928740646046079760/+/steps/compare_build_artifacts/0/stdout):
remoting-webapp.v2.zip : DIFFERENT (expected): different size: 63053176 != 63053339
remoting-webapp.v2/remoting_client_plugin_newlib.pexe.debug: CRCs differ
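A member-by-member comparison like the one reported here can be approximated with the standard zipfile module, which exposes each entry's stored size and CRC-32 without extracting anything. This is a minimal sketch under that assumption, not the code that was added to compare_build_artifacts.py; the function name and message wording are made up for the example.

```python
import zipfile
from typing import List


def diff_zip_members(first_zip: str, second_zip: str) -> List[str]:
    """Describe which members differ between two zip archives."""
    with zipfile.ZipFile(first_zip) as a, zipfile.ZipFile(second_zip) as b:
        a_info = {i.filename: i for i in a.infolist()}
        b_info = {i.filename: i for i in b.infolist()}
    report = []
    for name in sorted(set(a_info) | set(b_info)):
        if name not in a_info:
            report.append(f"{name}: only in second archive")
        elif name not in b_info:
            report.append(f"{name}: only in first archive")
        elif a_info[name].file_size != b_info[name].file_size:
            report.append(
                f"{name}: sizes differ "
                f"({a_info[name].file_size} != {b_info[name].file_size})"
            )
        elif a_info[name].CRC != b_info[name].CRC:
            # Same uncompressed size but different content checksum.
            report.append(f"{name}: CRCs differ")
    return report
```

Comparing the stored CRCs narrows a whole-archive mismatch down to individual entries, which is how the diff above points at the .pexe.debug file rather than the zip as a whole.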
Comment 1 by zmin@chromium.org, Oct 31
unexpected diffs: 2
Unexpected files with diffs:
browser_tests.isolated
remoting-webapp.v2.zip
ninja: error: unknown target 'browser_tests.isolated'
error to get graph for browser_tests.isolated: Command '['/b/swarming/w/ir/kitchen-checkout/depot_tools/ninja', '-C', '/b/swarming/w/ir/cache/builder/src/out/Release', '-t', 'graph', 'browser_tests.isolated']' returned non-zero exit status 1
ninja: error: unknown target 'browser_tests.isolated'
error to get graph for browser_tests.isolated: Command '['/b/swarming/w/ir/kitchen-checkout/depot_tools/ninja', '-C', '/b/swarming/w/ir/cache/builder/src/out/Release', '-t', 'graph', 'browser_tests.isolated']' returned non-zero exit status 1
Checking browser_tests.isolated difference: (0 deps)
Checking remoting-webapp.v2.zip difference: (2978 deps)
newlib_pnacl/obj/net/net/parse_name.o : 6 out of 516460 bytes are different (0.00%)
0x10cc0 : cae4564d409bf409d026ba1d1c40b0db010000200200005030f9fa4a9bfb627b '..VM@....&...@..... ...P0..J..b{'
          cae4564d419bf409d026ba1d1c40b0db010000200200005030f9fa4a9bfb627b '..VMA....&...@..... ...P0..J..b{'
                   ^                                                            ^
0x11000 : 0040b16a0adaa44f8036e9ede00002df0e0000001100008062d504b449a9006d '.@.j...O.6..............b...I..m'
          0040b16a02daa44f8036e9ede00002df0e0000001100008062d504b449a9006d '.@.j...O.6..............b...I..m'
                   ^
0x110e0 : e8bee6d0dee4e8bee6d2f4ca564d419bd40bd026701d1c4040d7010000200200 '............VMA....&p..@@.... ..'
          e8bee6d0dee4e8bee6d2f4ca564d409bd40bd026701d1c4040d7010000200200 '............VM@....&p..@@.... ..'
                                       ^                                                  ^
0x11f60 : 08000040b16a02da645e8036e9ebe00002bf0e0000001100008062d504b449bd '...@.j..d^.6..............b...I.'
          08000040b16a02da645e8036e9ebe00002bf0e0000001100008062d514b449bd '...@.j..d^.6..............b...I.'
                                                                  ^
0x12b00 : 3407000040040000a05835056dd227409be8767000c16e0700008008000040b1 '4...@....X5.m.'@..vp..n.......@.'
          3407000040040000a05835016dd227409be8767000c16e0700008008000040b1 '4...@....X5.m.'@..vp..n.......@.'
                                 ^
0x12b80 : 84bc91955513000fea05c0430c060710c8600000008800eb0500c5aa0968933e '....U......C.....`...........h.>'
          84bc91955513000fea05c0430c060710c8600000008800eb0500c5aa2968933e '....U......C.....`..........)h.>'
                                                                  ^                                     ^
newlib_pnacl/obj/base/base/string_piece.o : 2 out of 192752 bytes are different (0.00%)
0xbae0 : b930ba37b9b955132026ab038849b8070910887b000000880000001434c2e6e6 '.0.7..U. &...I.....{........4...'
         b930ba37b9b955532026ab038849b8070910887b000000880000001434c2e6e6 '.0.7..US &...I.....{........4...'
                       ^                                                          ^
0xbc60 : 809880e2900041280e00008008000040b16a0ac46475003109f7200102710f00 '......A(.......@.j..du.1.. ..q..'
         809880e2900041280e00008008000040b16a02c46475003109f7200102710f00 '......A(.......@.j..du.1.. ..q..'
                                              ^
newlib_pnacl/obj/base/base/string_util.o : 6 out of 767744 bytes are different (0.00%)
0x16d20 : 2b6dee8bedcded6cd5149ccae900a7f2cfc10104201d00000022000000c504e1 '+m.....l............ ...."......'
          2b6dee8bedcded6cd5049ccae900a7f2cfc10104201d00000022000000c504e1 '+m.....l............ ...."......'
                            ^
0x17120 : 0000008800f703000589b9bb30b85513702af9039cca8d0e0710e8e800000088 '............0.U.p*..............'
          0000008800f703000589b9bb30b85553702af9039cca8d0e0710e8e800000088 '............0.USp*..............'
                                        ^                                                  ^
0x17320 : 2b5b3505a79242c0a9f86c700081cf0600008008000040417a78117b99bb0b93 '+[5...B...lp..........@Azx.{....'
          2b5b3501a79242c0a9f86c700081cf0600008008000040417a78117b99bb0b93 '+[5...B...lp..........@Azx.{....'
                 ^
0x175a0 : 523ac0a9e8767000c16e0700008008000040b16a024ee5748053f9e7e0000290 'R:...vp..n.......@.j.N.t.S......'
          523ac0a9e8767000c16e0700008008000040b16a0a4ee5748053f9e7e0000290 'R:...vp..n.......@.j.N.t.S......'
                                                   ^
0x177a0 : 280000002240280140b16a0a4e257f8053b9d1e100021d1d0000001100008082 '(..."@(.@.j.N%..S...............'
          280000002240280140b16a024e257f8053b9d1e100021d1d0000001100008082 '(..."@(.@.j.N%..S...............'
                                 ^
0x17920 : 845142955513001efa0180474e0a0710e4a40000008800af0400c5aa09389514 '.QB.U......GN................8..'
          845142955513001efa0180474e0a0710e4a40000008800af0400c5aa29389514 '.QB.U......GN...............)8..'
                                                                  ^                                     ^
step returned non-zero exit code: 1
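The compact diff in the log above walks two object files in 32-byte chunks and prints only the chunks that differ. The sketch below shows one way such a chunked comparison could be written; it is an illustration under that reading of the output, not the implementation in compare_build_artifacts.py, and the function name and return shape are hypothetical.

```python
from typing import Iterator, Tuple


def diff_chunks(
    lhs: bytes, rhs: bytes, chunk_size: int = 32
) -> Iterator[Tuple[int, str, str, str]]:
    """Yield (offset, lhs_hex, rhs_hex, markers) for each chunk that differs."""
    for offset in range(0, max(len(lhs), len(rhs)), chunk_size):
        a = lhs[offset:offset + chunk_size]
        b = rhs[offset:offset + chunk_size]
        if a == b:
            continue
        a_hex, b_hex = a.hex(), b.hex()
        width = max(len(a_hex), len(b_hex))
        # Put a caret under every hex digit that differs between the two sides.
        markers = "".join(
            "^" if a_hex[i:i + 1] != b_hex[i:i + 1] else " " for i in range(width)
        ).rstrip()
        yield offset, a_hex, b_hex, markers
```

In every chunk shown in the log, the two sides differ in exactly one bit of one byte (for example 0x40 vs 0x41, or 0x09 vs 0x29).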