New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 697941 link

Starred by 1 user

Issue metadata

Status: Verified
Owner:
Last visit > 30 days ago
Closed: Mar 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux , Mac
Pri: 1
Type: Bug


Show other hotlists

Hotlists containing this issue:
Hotlist-1


Sign in to add a comment

Chromedriver Mac bot failing randomly

Project Member Reported by gmanikpure@chromium.org, Mar 2 2017

Issue description

We are seeing some random failures on Chromedriver Mac bot since yesterday evening.
https://build.chromium.org/p/chromium.chromedriver/builders/Mac%2010.6/builds/12065

When executed these tests locally, there are no failures. This confirms that these are not real bugs. May be something with the way bot is building & executing the tests?

Also, I see "fatal: Needed a single revision" during git rev-parse in https://build.chromium.org/p/chromium.chromedriver/builders/Mac%2010.6/builds/12065/steps/steps/logs/preamble
Could this be related?
 
In addition to the above, observed below stacktrace in https://build.chromium.org/p/chromium.chromedriver/builders/Mac%2010.6/builds/12065/steps/steps/logs/stdio

Execution of send_ts_mon_values failed with code 1
Traceback (most recent call last):
  File "/opt/infra-python/run.py", line 29, in <module>
    sys.exit(run_helper.run_py_main(sys.argv[1:], RUNPY_PATH, ENV_PATH, 'infra'))
  File "/opt/infra-python/bootstrap/run_helper.py", line 125, in run_py_main
    runpy.run_module(args[0], run_name='__main__', alter_sys=True)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 176, in run_module
    fname, loader, pkg_name)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 82, in _run_module_code
    mod_name, mod_fname, mod_loader, pkg_name)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/opt/infra-python/.cipd/pkgs/infra_python_mac-amd64-mac10_12_4Ey_NgujxD/5f7250706d9436aa4467131f8b169e2b16f5e4a3/infra/tools/send_ts_mon_values/__main__.py", line 13, in <module>
    sys.exit(common.main(sys.argv[1:]))
  File "/opt/infra-python/.cipd/pkgs/infra_python_mac-amd64-mac10_12_4Ey_NgujxD/5f7250706d9436aa4467131f8b169e2b16f5e4a3/infra/tools/send_ts_mon_values/common.py", line 307, in main
    set_metrics(arg, metric)
  File "/opt/infra-python/.cipd/pkgs/infra_python_mac-amd64-mac10_12_4Ey_NgujxD/5f7250706d9436aa4467131f8b169e2b16f5e4a3/infra/tools/send_ts_mon_values/common.py", line 168, in set_metrics
    return [set_metric(md, metric_type) for md in collapsed_metrics if md]
  File "/opt/infra-python/.cipd/pkgs/infra_python_mac-amd64-mac10_12_4Ey_NgujxD/5f7250706d9436aa4467131f8b169e2b16f5e4a3/infra/tools/send_ts_mon_values/common.py", line 252, in set_metric
    field_spec.append(field_ctor(value_cast(name)))
ValueError: invalid literal for long() with base 10: 'clobber'
Labels: -Pri-3 Pri-1
Bumping up the priority since we need this bot working today.

As an aside, the blamelist is useful for finding culprit CLs if they land in the chromium repo, but not if they land in one of the infra repos. Is there some kind of "infra blamelist" that lists infra changes that are new for the current build?
Owner: andyb...@chromium.org
Status: Started (was: Untriaged)
Owner: iannucci@chromium.org
Status: Assigned (was: Started)
Robbie, can you take a look at this?
Labels: OS-Linux
The same issue is now also seen on Linux32 bot.
https://build.chromium.org/p/chromium.chromedriver/builders/Linux32/builds/22228


Execution of send_ts_mon_values failed with code 1
Traceback (most recent call last):
  File "/opt/infra-python/run.py", line 29, in <module>
    sys.exit(run_helper.run_py_main(sys.argv[1:], RUNPY_PATH, ENV_PATH, 'infra'))
  File "/opt/infra-python/bootstrap/run_helper.py", line 125, in run_py_main
    runpy.run_module(args[0], run_name='__main__', alter_sys=True)
  File "/usr/lib/python2.7/runpy.py", line 176, in run_module
    fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 82, in _run_module_code
    mod_name, mod_fname, mod_loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/opt/infra-python/.cipd/pkgs/infra_python_linux-amd64-ubuntu14_04_tap59FFGuW/b297600ab25f61d73f309de224ea66b68cfdda47/infra/tools/send_ts_mon_values/__main__.py", line 13, in <module>
    sys.exit(common.main(sys.argv[1:]))
  File "/opt/infra-python/.cipd/pkgs/infra_python_linux-amd64-ubuntu14_04_tap59FFGuW/b297600ab25f61d73f309de224ea66b68cfdda47/infra/tools/send_ts_mon_values/common.py", line 307, in main
    set_metrics(arg, metric)
  File "/opt/infra-python/.cipd/pkgs/infra_python_linux-amd64-ubuntu14_04_tap59FFGuW/b297600ab25f61d73f309de224ea66b68cfdda47/infra/tools/send_ts_mon_values/common.py", line 168, in set_metrics
    return [set_metric(md, metric_type) for md in collapsed_metrics if md]
  File "/opt/infra-python/.cipd/pkgs/infra_python_linux-amd64-ubuntu14_04_tap59FFGuW/b297600ab25f61d73f309de224ea66b68cfdda47/infra/tools/send_ts_mon_values/common.py", line 252, in set_metric
    field_spec.append(field_ctor(value_cast(name)))
ValueError: invalid literal for long() with base 10: 'clobber'
I'm not sure about the linux32 bot, but on the mac bot it looks like Chrome is crashing (failed CHECK) due to a very long socket path:

[0.134][INFO]: Launching chrome: /var/folders/p4/py6nlvcx4fs5s3fg51_pzqmc0000gm/T/chromedriver_naRFBb/chromedriver_roWqhA/chrome-mac/Chromium.app/Contents/MacOS/Chromium --disable-background-networking --disable-client-side-phishing-detection --disable-default-apps --disable-hang-monitor --disable-popup-blocking --disable-prompt-on-repost --disable-sync --disable-web-resources --enable-automation --enable-logging --ignore-certificate-errors --load-component-extension=/var/folders/p4/py6nlvcx4fs5s3fg51_pzqmc0000gm/T/chromedriver_naRFBb/.org.chromium.Chromium.FYlmEq/internal --log-level=0 --metrics-recording-only --no-first-run --password-store=basic --remote-debugging-port=12577 --safebrowsing-disable-auto-update --test-type=webdriver --use-mock-keychain --user-data-dir=/var/folders/p4/py6nlvcx4fs5s3fg51_pzqmc0000gm/T/chromedriver_naRFBb/.org.chromium.Chromium.Wxn34L data:,
....
[15403:775:0303/132611.894909:FATAL:process_singleton_posix.cc(223)] Check failed: path.length() < arraysize(addr->sun_path). Socket path too long: /var/folders/p4/py6nlvcx4fs5s3fg51_pzqmc0000gm/T/chromedriver_naRFBb/.org.chromium.Chromium.wm9Z53/SingletonSocket

"/var/folders/p4/py6nlvcx4fs5s3fg51_pzqmc0000gm/T/chromedriver_naRFBb/.org.chromium.Chromium.wm9Z53/SingletonSocket" is 114 chars long

I think arraysize(addr->sun_path) == 104 according to https://developer.apple.com/legacy/library/documentation/Darwin/Reference/ManPages/man4/unix.4.html

So yeah, 114 > 104. The user data dir is in a temp dir, does anyone know why the paths are so long now?

For comparison, from one of the previous builds I see that the user data dir was put in a different location (with a much shorter path):

--user-data-dir=/tmp/chromedriver_Mn8oi2/.org.chromium.Chromium.ys8WDi

from http://chromedriver-data.storage.googleapis.com/server_logs/chromedriver_log_7HSiuO
Here's a CL to fix: https://chromium-review.googlesource.com/c/449066/

This should change the TMPDIR to something like /b/rr/Nj65pQS5fn, which I think should fix this bug. Since it's late on a friday though, I'll land this Monday morning (PST), if that's alright.
Project Member

Comment 11 by bugdroid1@chromium.org, Mar 6 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/tools/build/+/28b65c3651b9fbf1d8a8e08d4534714748c98d44

commit 28b65c3651b9fbf1d8a8e08d4534714748c98d44
Author: Robbie Iannucci <iannucci@chromium.org>
Date: Mon Mar 06 21:22:26 2017

Revert "[*_run.py] set $TEMP/$TMPDIR for {annotated,remote}_run.py."

This reverts commit 93422fb16bcbe181d827f18db65c86f375a1f885.

Reason for revert: failure related to cloudtail: https://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_rel_ng/builds/394942/steps/Recipe%20engine%20bug/logs/stdio
Original change's description:
> [*_run.py] set $TEMP/$TMPDIR for {annotated,remote}_run.py.
> 
> BUG= 697941 
> 
> Change-Id: I6313c27c46af4ad5cfd904533d54bf4a7dafae8d
> Reviewed-on: https://chromium-review.googlesource.com/449066
> Commit-Queue: Robbie Iannucci <iannucci@chromium.org>
> Reviewed-by: Daniel Jacques <dnj@chromium.org>
> 

TBR=iannucci@chromium.org,dnj@chromium.org,chromium-reviews@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG= 697941 

Change-Id: I641a07f7d75cf874b849ef5cdb69df4cd888d604
Reviewed-on: https://chromium-review.googlesource.com/450598
Reviewed-by: Robbie Iannucci <iannucci@chromium.org>
Commit-Queue: Robbie Iannucci <iannucci@chromium.org>

[modify] https://crrev.com/28b65c3651b9fbf1d8a8e08d4534714748c98d44/scripts/slave/robust_tempdir.py
[modify] https://crrev.com/28b65c3651b9fbf1d8a8e08d4534714748c98d44/scripts/slave/annotated_run.py
[modify] https://crrev.com/28b65c3651b9fbf1d8a8e08d4534714748c98d44/scripts/slave/remote_run.py
[modify] https://crrev.com/28b65c3651b9fbf1d8a8e08d4534714748c98d44/scripts/slave/unittests/remote_run_test.py

Project Member

Comment 12 by bugdroid1@chromium.org, Mar 6 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/bb0e33882bce1a7dc0b1a2c27149909bb755052c

commit bb0e33882bce1a7dc0b1a2c27149909bb755052c
Author: samuong <samuong@chromium.org>
Date: Mon Mar 06 21:53:36 2017

[chromedriver] Remove _CleanTmpDir and use $TMPDIR directly on bots.

Previously we were creating a subdirectory under $TMPDIR so that it could be
easily cleaned up by _CleanTmpDir (see  crbug.com/598366  for context).

We'll be getting a new unique temp dir per build once  crbug.com/697941  is fixed.
This makes our _CleanTmpDir function redundant, so I'm removing it.

BUG= 697941 

Review-Url: https://codereview.chromium.org/2736433002
Cr-Commit-Position: refs/heads/master@{#454969}

[modify] https://crrev.com/bb0e33882bce1a7dc0b1a2c27149909bb755052c/chrome/test/chromedriver/run_buildbot_steps.py

Project Member

Comment 13 by bugdroid1@chromium.org, Mar 6 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/tools/build/+/4236db6cb072c55f9483bd670fdfd9acfd3a95a1

commit 4236db6cb072c55f9483bd670fdfd9acfd3a95a1
Author: Robert Iannucci <iannucci@chromium.org>
Date: Mon Mar 06 22:22:27 2017

Reland "[*_run.py] set $TEMP/$TMPDIR for {annotated,remote}_run.py."
Reland "[robust_tmpdir.py] set $TMP too"

Originally this only set $TEMP and $TMPDIR. Unfortunately that meant
that on windows $TMP had a different value than both. Most of the
time that's OK (but silly), however the goma recipe module in
depot_tools ends up using BOTH $TMPDIR and $TMP on windows, leading
to path confusion.

This reland won't transition through the intermediate state.

BUG= 697941 ,698851

Change-Id: Ie0469fa5d420c10ec8ad2d2fa045d3b929b223a1
Reviewed-on: https://chromium-review.googlesource.com/450697
Reviewed-by: Daniel Jacques <dnj@chromium.org>
Commit-Queue: Robbie Iannucci <iannucci@chromium.org>

[modify] https://crrev.com/4236db6cb072c55f9483bd670fdfd9acfd3a95a1/scripts/slave/robust_tempdir.py
[modify] https://crrev.com/4236db6cb072c55f9483bd670fdfd9acfd3a95a1/scripts/slave/annotated_run.py
[modify] https://crrev.com/4236db6cb072c55f9483bd670fdfd9acfd3a95a1/scripts/slave/remote_run.py
[modify] https://crrev.com/4236db6cb072c55f9483bd670fdfd9acfd3a95a1/scripts/slave/unittests/remote_run_test.py

Thanks Robbie & Sam for fixing the Mac bot issue. Mac bot is running fine since then.

However, I noticed that Chromedriver logs are no longer archived on the desktop bots. It displays "This build is configured to send log data exclusively to LogDog. Please click the LogDog link on the build page to view this log stream."

https://build.chromium.org/p/chromium.chromedriver/builders/Mac%2010.6/builds/12264/steps/archive%20chromedriver%20server%20logs/logs/stdio

I am not sure how to get the logs from logdog :(

Robbie, could you please point to any docs on how to use logdog?
Owner: d...@chromium.org
Dan is the logdog guy. I'm not sure why there's no link on that builder page though.

Comment 16 by d...@chromium.org, Mar 24 2017

Owner: iannucci@chromium.org
Oh, interesting. This suggests that the step didn't actually produce logs: https://screenshot.googleplex.com/2j9B63Qu0rB

Good news is that you're not missing anything. Bad news is that you hit an edge case in my template hack. I'm going to fix that elsewhere.

Comment 17 by d...@chromium.org, Mar 24 2017

Owner: ----
Status: Fixed (was: Assigned)
Marking Fixed, since the actual underlying problem is resolved.
And also all new builds are correctly linking to logdog too; I think this particular build just got caught by a couple different things at the same time.
Cc: iannucci@chromium.org
Owner: d...@chromium.org
Status: Assigned (was: Fixed)
Sorry, not sure if I am looking at the correct subpage. But I still don't see any logdog link at any of the Chromedriver desktop bots. https://build.chromium.org/p/chromium.chromedriver/builders/Linux/builds/26451/steps/archive%20chromedriver%20server%20logs/logs/stdio

Also, not sure if we are on same page, I am referring to chromedriver server 'verbose' logs not being archived. Chromedriver test execution output log is correctly displayed.
1) Execution logs are available at the sublinks under each test run. For example - https://luci-logdog.appspot.com/v/?s=chromium%2Fbb%2Fchromium.chromedriver%2FLinux%2F26373%2F%2B%2Frecipes%2Fsteps%2Fpython_tests_v56_%2F0%2Fstdout

But
2) Chromedriver verbose logs are not available under 'archive chromedriver server logs' step for desktop bots.

whereas, for chromedriver android bot, chromedriver verbose logs are successfully archived. Please see the attached screenshot.
ChromedriverAndroidBotScreenshot.png
402 KB View Download

Comment 20 by d...@chromium.org, Mar 27 2017

I'm not entirely sure what the problem here is. I looked the build. All of the links are LogDog links, as they should be.

1) "Execution logs are not available". The build that you linked doesn't have a "python_tests chromium" step. It has other nested steps (goma), and LogDog links are working properly for them.

The build that you linked does not emit nested steps (see "STEP_NEST_LEVEL annotations, only are emitted for goma steps). If it was supposed to, this is a bug in your recipe, not LogDog.

2) "Chromedriver verbose logs are not available".

I covered this in #16: there are no logs for that step. If there was a BuildBot link, it would just be empty. An "stdio" link showing that references a non-existent LogDog stream is an unfortunate edge case in the template logic, and it is non-trivial to fix this edge case. If you wouldn't mind, could you just ignore it?
The 'archive chromedriver server logs' step just copies the chromedriver log files to google storage - https://cs.chromium.org/chromium/src/chrome/test/chromedriver/run_buildbot_steps.py?q=run_buildbot_steps.py+package:%5Echromium$&l=61

I think LogDog takes the stdout(/stderr?) stream and logs those, so this might not be exactly related.

It did, however, break at around the same time. Is there any chance that your LogDog changes would break upload to GS?

Comment 22 by d...@chromium.org, Mar 27 2017

It seems unlikely that the switch to LogDog-only mode would have broken your archive step.

LogDog reads the STDOUT/STDERR data and creates streams for them. It does, however, do this lazily, meaning that it creates an "stdout" stream when data is observed on STDOUT. If a step never sends STDOUT data, a LogDog stream for that data will never be emitted.
Cc: -samu...@chromium.org d...@chromium.org
Owner: samu...@chromium.org
It's probably not due to LogDog then. This probably regressed when we messed around with TMPDIR recently.

I took a look at the code for that step, and I think I've found the problem. https://codereview.chromium.org/2777233002 might fix it, and if not it will add some logging which could give us some clues about why it's failing.
Project Member

Comment 24 by bugdroid1@chromium.org, Mar 27 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/4347e0851af0acb9c6e0d5c17cec03ef6a167749

commit 4347e0851af0acb9c6e0d5c17cec03ef6a167749
Author: samuong <samuong@chromium.org>
Date: Mon Mar 27 20:04:07 2017

[chromedriver] Fix path for server logs and add logging.

BUG= 697941 
TBR=stgao@chromium.org

Review-Url: https://codereview.chromium.org/2777233002
Cr-Commit-Position: refs/heads/master@{#459862}

[modify] https://crrev.com/4347e0851af0acb9c6e0d5c17cec03ef6a167749/chrome/test/chromedriver/run_buildbot_steps.py

Status: Verified (was: Assigned)
Dan, thanks for your time.

Sam, thanks for the fix. Chromedriver verbose logs are now successfully being archived. https://build.chromium.org/p/chromium.chromedriver/builders/Linux/builds/26457

Sign in to add a comment