Chromedriver Mac bot failing randomly |
|||||||||||
Issue descriptionWe are seeing some random failures on Chromedriver Mac bot since yesterday evening. https://build.chromium.org/p/chromium.chromedriver/builders/Mac%2010.6/builds/12065 When executed these tests locally, there are no failures. This confirms that these are not real bugs. May be something with the way bot is building & executing the tests? Also, I see "fatal: Needed a single revision" during git rev-parse in https://build.chromium.org/p/chromium.chromedriver/builders/Mac%2010.6/builds/12065/steps/steps/logs/preamble Could this be related?
,
Mar 2 2017
Bumping up the priority since we need this bot working today. As an aside, the blamelist is useful for finding culprit CLs if they land in the chromium repo, but not if they land in one of the infra repos. Is there some kind of "infra blamelist" that lists infra changes that are new for the current build?
,
Mar 2 2017
,
Mar 2 2017
Robbie, can you take a look at this?
,
Mar 3 2017
The same issue is now also seen on Linux32 bot. https://build.chromium.org/p/chromium.chromedriver/builders/Linux32/builds/22228 Execution of send_ts_mon_values failed with code 1 Traceback (most recent call last): File "/opt/infra-python/run.py", line 29, in <module> sys.exit(run_helper.run_py_main(sys.argv[1:], RUNPY_PATH, ENV_PATH, 'infra')) File "/opt/infra-python/bootstrap/run_helper.py", line 125, in run_py_main runpy.run_module(args[0], run_name='__main__', alter_sys=True) File "/usr/lib/python2.7/runpy.py", line 176, in run_module fname, loader, pkg_name) File "/usr/lib/python2.7/runpy.py", line 82, in _run_module_code mod_name, mod_fname, mod_loader, pkg_name) File "/usr/lib/python2.7/runpy.py", line 72, in _run_code exec code in run_globals File "/opt/infra-python/.cipd/pkgs/infra_python_linux-amd64-ubuntu14_04_tap59FFGuW/b297600ab25f61d73f309de224ea66b68cfdda47/infra/tools/send_ts_mon_values/__main__.py", line 13, in <module> sys.exit(common.main(sys.argv[1:])) File "/opt/infra-python/.cipd/pkgs/infra_python_linux-amd64-ubuntu14_04_tap59FFGuW/b297600ab25f61d73f309de224ea66b68cfdda47/infra/tools/send_ts_mon_values/common.py", line 307, in main set_metrics(arg, metric) File "/opt/infra-python/.cipd/pkgs/infra_python_linux-amd64-ubuntu14_04_tap59FFGuW/b297600ab25f61d73f309de224ea66b68cfdda47/infra/tools/send_ts_mon_values/common.py", line 168, in set_metrics return [set_metric(md, metric_type) for md in collapsed_metrics if md] File "/opt/infra-python/.cipd/pkgs/infra_python_linux-amd64-ubuntu14_04_tap59FFGuW/b297600ab25f61d73f309de224ea66b68cfdda47/infra/tools/send_ts_mon_values/common.py", line 252, in set_metric field_spec.append(field_ctor(value_cast(name))) ValueError: invalid literal for long() with base 10: 'clobber'
,
Mar 3 2017
I'm not sure about the linux32 bot, but on the mac bot it looks like Chrome is crashing (failed CHECK) due to a very long socket path: [0.134][INFO]: Launching chrome: /var/folders/p4/py6nlvcx4fs5s3fg51_pzqmc0000gm/T/chromedriver_naRFBb/chromedriver_roWqhA/chrome-mac/Chromium.app/Contents/MacOS/Chromium --disable-background-networking --disable-client-side-phishing-detection --disable-default-apps --disable-hang-monitor --disable-popup-blocking --disable-prompt-on-repost --disable-sync --disable-web-resources --enable-automation --enable-logging --ignore-certificate-errors --load-component-extension=/var/folders/p4/py6nlvcx4fs5s3fg51_pzqmc0000gm/T/chromedriver_naRFBb/.org.chromium.Chromium.FYlmEq/internal --log-level=0 --metrics-recording-only --no-first-run --password-store=basic --remote-debugging-port=12577 --safebrowsing-disable-auto-update --test-type=webdriver --use-mock-keychain --user-data-dir=/var/folders/p4/py6nlvcx4fs5s3fg51_pzqmc0000gm/T/chromedriver_naRFBb/.org.chromium.Chromium.Wxn34L data:, .... [15403:775:0303/132611.894909:FATAL:process_singleton_posix.cc(223)] Check failed: path.length() < arraysize(addr->sun_path). Socket path too long: /var/folders/p4/py6nlvcx4fs5s3fg51_pzqmc0000gm/T/chromedriver_naRFBb/.org.chromium.Chromium.wm9Z53/SingletonSocket "/var/folders/p4/py6nlvcx4fs5s3fg51_pzqmc0000gm/T/chromedriver_naRFBb/.org.chromium.Chromium.wm9Z53/SingletonSocket" is 114 chars long I think arraysize(addr->sun_path) == 104 according to https://developer.apple.com/legacy/library/documentation/Darwin/Reference/ManPages/man4/unix.4.html So yeah, 114 > 104. The user data dir is in a temp dir, does anyone know why the paths are so long now? For comparison, from one of the previous builds I see that the user data dir was put in a different location (with a much shorter path): --user-data-dir=/tmp/chromedriver_Mn8oi2/.org.chromium.Chromium.ys8WDi from http://chromedriver-data.storage.googleapis.com/server_logs/chromedriver_log_7HSiuO
,
Mar 3 2017
,
Mar 4 2017
Here's a CL to fix: https://chromium-review.googlesource.com/c/449066/ This should change the TMPDIR to something like /b/rr/Nj65pQS5fn, which I think should fix this bug. Since it's late on a friday though, I'll land this Monday morning (PST), if that's alright.
,
Mar 6 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/tools/build/+/93422fb16bcbe181d827f18db65c86f375a1f885 commit 93422fb16bcbe181d827f18db65c86f375a1f885 Author: Robert Iannucci <iannucci@chromium.org> Date: Mon Mar 06 19:57:12 2017 [*_run.py] set $TEMP/$TMPDIR for {annotated,remote}_run.py. BUG= 697941 Change-Id: I6313c27c46af4ad5cfd904533d54bf4a7dafae8d Reviewed-on: https://chromium-review.googlesource.com/449066 Commit-Queue: Robbie Iannucci <iannucci@chromium.org> Reviewed-by: Daniel Jacques <dnj@chromium.org> [modify] https://crrev.com/93422fb16bcbe181d827f18db65c86f375a1f885/scripts/slave/robust_tempdir.py [modify] https://crrev.com/93422fb16bcbe181d827f18db65c86f375a1f885/scripts/slave/annotated_run.py [modify] https://crrev.com/93422fb16bcbe181d827f18db65c86f375a1f885/scripts/slave/remote_run.py [modify] https://crrev.com/93422fb16bcbe181d827f18db65c86f375a1f885/scripts/slave/unittests/remote_run_test.py
,
Mar 6 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/tools/build/+/93422fb16bcbe181d827f18db65c86f375a1f885 commit 93422fb16bcbe181d827f18db65c86f375a1f885 Author: Robert Iannucci <iannucci@chromium.org> Date: Mon Mar 06 19:57:12 2017 [*_run.py] set $TEMP/$TMPDIR for {annotated,remote}_run.py. BUG= 697941 Change-Id: I6313c27c46af4ad5cfd904533d54bf4a7dafae8d Reviewed-on: https://chromium-review.googlesource.com/449066 Commit-Queue: Robbie Iannucci <iannucci@chromium.org> Reviewed-by: Daniel Jacques <dnj@chromium.org> [modify] https://crrev.com/93422fb16bcbe181d827f18db65c86f375a1f885/scripts/slave/robust_tempdir.py [modify] https://crrev.com/93422fb16bcbe181d827f18db65c86f375a1f885/scripts/slave/annotated_run.py [modify] https://crrev.com/93422fb16bcbe181d827f18db65c86f375a1f885/scripts/slave/remote_run.py [modify] https://crrev.com/93422fb16bcbe181d827f18db65c86f375a1f885/scripts/slave/unittests/remote_run_test.py
,
Mar 6 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/tools/build/+/28b65c3651b9fbf1d8a8e08d4534714748c98d44 commit 28b65c3651b9fbf1d8a8e08d4534714748c98d44 Author: Robbie Iannucci <iannucci@chromium.org> Date: Mon Mar 06 21:22:26 2017 Revert "[*_run.py] set $TEMP/$TMPDIR for {annotated,remote}_run.py." This reverts commit 93422fb16bcbe181d827f18db65c86f375a1f885. Reason for revert: failure related to cloudtail: https://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_rel_ng/builds/394942/steps/Recipe%20engine%20bug/logs/stdio Original change's description: > [*_run.py] set $TEMP/$TMPDIR for {annotated,remote}_run.py. > > BUG= 697941 > > Change-Id: I6313c27c46af4ad5cfd904533d54bf4a7dafae8d > Reviewed-on: https://chromium-review.googlesource.com/449066 > Commit-Queue: Robbie Iannucci <iannucci@chromium.org> > Reviewed-by: Daniel Jacques <dnj@chromium.org> > TBR=iannucci@chromium.org,dnj@chromium.org,chromium-reviews@chromium.org NOPRESUBMIT=true NOTREECHECKS=true NOTRY=true BUG= 697941 Change-Id: I641a07f7d75cf874b849ef5cdb69df4cd888d604 Reviewed-on: https://chromium-review.googlesource.com/450598 Reviewed-by: Robbie Iannucci <iannucci@chromium.org> Commit-Queue: Robbie Iannucci <iannucci@chromium.org> [modify] https://crrev.com/28b65c3651b9fbf1d8a8e08d4534714748c98d44/scripts/slave/robust_tempdir.py [modify] https://crrev.com/28b65c3651b9fbf1d8a8e08d4534714748c98d44/scripts/slave/annotated_run.py [modify] https://crrev.com/28b65c3651b9fbf1d8a8e08d4534714748c98d44/scripts/slave/remote_run.py [modify] https://crrev.com/28b65c3651b9fbf1d8a8e08d4534714748c98d44/scripts/slave/unittests/remote_run_test.py
,
Mar 6 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/bb0e33882bce1a7dc0b1a2c27149909bb755052c commit bb0e33882bce1a7dc0b1a2c27149909bb755052c Author: samuong <samuong@chromium.org> Date: Mon Mar 06 21:53:36 2017 [chromedriver] Remove _CleanTmpDir and use $TMPDIR directly on bots. Previously we were creating a subdirectory under $TMPDIR so that it could be easily cleaned up by _CleanTmpDir (see crbug.com/598366 for context). We'll be getting a new unique temp dir per build once crbug.com/697941 is fixed. This makes our _CleanTmpDir function redundant, so I'm removing it. BUG= 697941 Review-Url: https://codereview.chromium.org/2736433002 Cr-Commit-Position: refs/heads/master@{#454969} [modify] https://crrev.com/bb0e33882bce1a7dc0b1a2c27149909bb755052c/chrome/test/chromedriver/run_buildbot_steps.py
,
Mar 6 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/tools/build/+/4236db6cb072c55f9483bd670fdfd9acfd3a95a1 commit 4236db6cb072c55f9483bd670fdfd9acfd3a95a1 Author: Robert Iannucci <iannucci@chromium.org> Date: Mon Mar 06 22:22:27 2017 Reland "[*_run.py] set $TEMP/$TMPDIR for {annotated,remote}_run.py." Reland "[robust_tmpdir.py] set $TMP too" Originally this only set $TEMP and $TMPDIR. Unfortunately that meant that on windows $TMP had a different value than both. Most of the time that's OK (but silly), however the goma recipe module in depot_tools ends up using BOTH $TMPDIR and $TMP on windows, leading to path confusion. This reland won't transition through the intermediate state. BUG= 697941 ,698851 Change-Id: Ie0469fa5d420c10ec8ad2d2fa045d3b929b223a1 Reviewed-on: https://chromium-review.googlesource.com/450697 Reviewed-by: Daniel Jacques <dnj@chromium.org> Commit-Queue: Robbie Iannucci <iannucci@chromium.org> [modify] https://crrev.com/4236db6cb072c55f9483bd670fdfd9acfd3a95a1/scripts/slave/robust_tempdir.py [modify] https://crrev.com/4236db6cb072c55f9483bd670fdfd9acfd3a95a1/scripts/slave/annotated_run.py [modify] https://crrev.com/4236db6cb072c55f9483bd670fdfd9acfd3a95a1/scripts/slave/remote_run.py [modify] https://crrev.com/4236db6cb072c55f9483bd670fdfd9acfd3a95a1/scripts/slave/unittests/remote_run_test.py
,
Mar 17 2017
Thanks Robbie & Sam for fixing the Mac bot issue. Mac bot is running fine since then. However, I noticed that Chromedriver logs are no longer archived on the desktop bots. It displays "This build is configured to send log data exclusively to LogDog. Please click the LogDog link on the build page to view this log stream." https://build.chromium.org/p/chromium.chromedriver/builders/Mac%2010.6/builds/12264/steps/archive%20chromedriver%20server%20logs/logs/stdio I am not sure how to get the logs from logdog :( Robbie, could you please point to any docs on how to use logdog?
,
Mar 24 2017
Dan is the logdog guy. I'm not sure why there's no link on that builder page though.
,
Mar 24 2017
Oh, interesting. This suggests that the step didn't actually produce logs: https://screenshot.googleplex.com/2j9B63Qu0rB Good news is that you're not missing anything. Bad news is that you hit an edge case in my template hack. I'm going to fix that elsewhere.
,
Mar 24 2017
Marking Fixed, since the actual underlying problem is resolved.
,
Mar 24 2017
And also all new builds are correctly linking to logdog too; I think this particular build just got caught by a couple different things at the same time.
,
Mar 27 2017
Sorry, not sure if I am looking at the correct subpage. But I still don't see any logdog link at any of the Chromedriver desktop bots. https://build.chromium.org/p/chromium.chromedriver/builders/Linux/builds/26451/steps/archive%20chromedriver%20server%20logs/logs/stdio Also, not sure if we are on same page, I am referring to chromedriver server 'verbose' logs not being archived. Chromedriver test execution output log is correctly displayed. 1) Execution logs are available at the sublinks under each test run. For example - https://luci-logdog.appspot.com/v/?s=chromium%2Fbb%2Fchromium.chromedriver%2FLinux%2F26373%2F%2B%2Frecipes%2Fsteps%2Fpython_tests_v56_%2F0%2Fstdout But 2) Chromedriver verbose logs are not available under 'archive chromedriver server logs' step for desktop bots. whereas, for chromedriver android bot, chromedriver verbose logs are successfully archived. Please see the attached screenshot.
,
Mar 27 2017
I'm not entirely sure what the problem here is. I looked the build. All of the links are LogDog links, as they should be. 1) "Execution logs are not available". The build that you linked doesn't have a "python_tests chromium" step. It has other nested steps (goma), and LogDog links are working properly for them. The build that you linked does not emit nested steps (see "STEP_NEST_LEVEL annotations, only are emitted for goma steps). If it was supposed to, this is a bug in your recipe, not LogDog. 2) "Chromedriver verbose logs are not available". I covered this in #16: there are no logs for that step. If there was a BuildBot link, it would just be empty. An "stdio" link showing that references a non-existent LogDog stream is an unfortunate edge case in the template logic, and it is non-trivial to fix this edge case. If you wouldn't mind, could you just ignore it?
,
Mar 27 2017
The 'archive chromedriver server logs' step just copies the chromedriver log files to google storage - https://cs.chromium.org/chromium/src/chrome/test/chromedriver/run_buildbot_steps.py?q=run_buildbot_steps.py+package:%5Echromium$&l=61 I think LogDog takes the stdout(/stderr?) stream and logs those, so this might not be exactly related. It did, however, break at around the same time. Is there any chance that your LogDog changes would break upload to GS?
,
Mar 27 2017
It seems unlikely that the switch to LogDog-only mode would have broken your archive step. LogDog reads the STDOUT/STDERR data and creates streams for them. It does, however, do this lazily, meaning that it creates an "stdout" stream when data is observed on STDOUT. If a step never sends STDOUT data, a LogDog stream for that data will never be emitted.
,
Mar 27 2017
It's probably not due to LogDog then. This probably regressed when we messed around with TMPDIR recently. I took a look at the code for that step, and I think I've found the problem. https://codereview.chromium.org/2777233002 might fix it, and if not it will add some logging which could give us some clues about why it's failing.
,
Mar 27 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/4347e0851af0acb9c6e0d5c17cec03ef6a167749 commit 4347e0851af0acb9c6e0d5c17cec03ef6a167749 Author: samuong <samuong@chromium.org> Date: Mon Mar 27 20:04:07 2017 [chromedriver] Fix path for server logs and add logging. BUG= 697941 TBR=stgao@chromium.org Review-Url: https://codereview.chromium.org/2777233002 Cr-Commit-Position: refs/heads/master@{#459862} [modify] https://crrev.com/4347e0851af0acb9c6e0d5c17cec03ef6a167749/chrome/test/chromedriver/run_buildbot_steps.py
,
Mar 27 2017
Dan, thanks for your time. Sam, thanks for the fix. Chromedriver verbose logs are now successfully being archived. https://build.chromium.org/p/chromium.chromedriver/builders/Linux/builds/26457 |
|||||||||||
►
Sign in to add a comment |
|||||||||||
Comment 1 by gmanikpure@chromium.org
, Mar 2 2017