Issue 821227

Issue metadata

Status: Fixed
Owner:
Closed: Mar 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug

Swarming builds can't launch hwtests.

Project Member Reported by dgarr...@chromium.org, Mar 13 2018

Issue description

Swarming tryjobs such as this one:

https://cros-goldeneye.corp.google.com/chromeos/healthmonitoring/buildDetails?buildbucketId=8952232723342486912

get errors like this from the swarming proxy (log supplied by xixuan@):

<NewTaskRequest (/base/data/home/apps/s~chromeos-proxy/3364-7d49171.408000810670529559/handlers_endpoints.py:381)
  expiration_secs: 1200
  name: u'trybot-samus-release-tryjob/R67-10482.0.0-b2374980-bvt-arc'
  parent_task_id: u'3c347c50ef4dad11'
  priority: 100
  properties: <TaskProperties
    command: [u'/usr/local/autotest/site_utils/run_suite.py', u'--build', u'trybot-samus-release-tryjob/R67-10482.0.0-b2374980', u'--board', u'samus', u'--suite_name', u'bvt-arc', u'--pool', u'suites', u'--file_bugs', u'False', u'--priority', u'Build', u'--timeout_mins', u'180', u'--retry', u'True', u'--max_retries', u'5', u'--offload_failures_only', u'False', u'--job_keyvals', u"{'cidb_build_stage_id': 73176146L, 'cidb_build_id': 2374980, 'datastore_parent_key': ('Build', 2374980, 'BuildStage', 73176146L)}", u'-c']
    dimensions: [<StringPair key: u'os' value: u'Ubuntu-14.04'>, <StringPair key: u'pool' value: u'default'>]
    env: []
    execution_timeout_secs: 14400
    extra_args: []
    grace_period_secs: 30
    idempotent: False
    io_timeout_secs: 14400
    caches: []
    outputs: []
    env_prefixes: []>
  tags: [u'priority:Build', u'suite:bvt-arc', u'build:trybot-samus-release-tryjob/R67-10482.0.0-b2374980', u'task_name:trybot-samus-release-tryjob/R67-10482.0.0-b2374980-bvt-arc', u'board:samus']>

2018-03-12 14:33:57.752 PDT
parent_task_id is not a valid task (/base/data/home/apps/s~chromeos-proxy/3364-7d49171.408000810670529559/components/auth/endpoints_support.py:177)
Traceback (most recent call last):
  File "components/auth/endpoints_support.py", line 174, in wrapper
    return func(service, *args, **kwargs)
  File "components/auth/api.py", line 1543, in wrapper
    return func(*args, **kwargs)
  File "handlers_endpoints.py", line 392, in new
    raise endpoints.BadRequestException(e.message)
BadRequestException: parent_task_id is not a valid task
oops,bad
https://pantheon.corp.google.com/logs/viewer?expandAll=false&filters=path:%22%2F_ah%2Fspi%2FSwarmingTasksService.new%22&logName=projects%2Fchromeos-proxy%2Flogs%2Fappengine.googleapis.com%252Frequest_log&project=chromeos-proxy&organizationId=433637338589&resource=gae_app&minLogLevel=500&timestamp=2018-03-12T21:33:57.666242000Z&dateRangeStart=2018-03-12T18:28:22.891Z&dateRangeEnd=2018-03-13T00:28:22.891Z&interval=PT6H
The second URL shows the error: "BadRequestException: parent_task_id is not a valid task".
According to the code in chromite/third_party/swarming.client/swarming.py, this looks like a builder-related variable: parent_task_id=os.environ.get('SWARMING_TASK_ID', '')
I kicked off a swarming command with the same parameters on another builder, and it succeeded. Here are the parameters passed in: https://screenshot.googleplex.com/8fVnMFDGi5t
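
To make the suspected mechanism concrete, here is a minimal sketch. It assumes a hypothetical `build_request` helper that only mimics the `os.environ.get` line quoted above; it is not the actual swarming client code, and the request shape is simplified for illustration.

```python
import os


def build_request(name, environ=None):
    """Hypothetical stand-in for the client code that fills parent_task_id."""
    environ = os.environ if environ is None else environ
    return {
        'name': name,
        # Mirrors the quoted line: an inherited SWARMING_TASK_ID is sent
        # along as the parent of the new task.
        'parent_task_id': environ.get('SWARMING_TASK_ID', ''),
    }


# Inside a swarming build the variable is set by the build's own swarming
# instance (the ID below is the one from the failing request in the log).
inside_swarming = build_request('bvt-arc', {'SWARMING_TASK_ID': '3c347c50ef4dad11'})

# On a builder that does not run under swarming the variable is absent,
# so no parent task is sent with the request.
outside_swarming = build_request('bvt-arc', {})
```

Under these assumptions, only the request built inside the swarming build carries a parent task ID, which would explain why the same command succeeds from another builder.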
 
It looks like the solution is to stop fetching SWARMING_TASK_ID, or to unset it before kicking off the hwtest request.

This problem might arise because the build is invoked from one instance of swarming while the proxy request goes through a different instance.
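
The "unset it before kicking off the hwtest request" option could look roughly like the sketch below. The names `env_without_swarming_task_id` and `run_proxy_command` are hypothetical and chosen for illustration; this is not the code in cbuildbot/swarming_lib.py, only one way to keep the variable out of a child process.

```python
import os
import subprocess
import sys


def env_without_swarming_task_id(environ=None):
    """Return a copy of the environment with SWARMING_TASK_ID removed."""
    environ = os.environ if environ is None else environ
    env = dict(environ)  # Copy, so the caller's environment is not modified.
    env.pop('SWARMING_TASK_ID', None)
    return env


def run_proxy_command(cmd, environ=None):
    """Hypothetical wrapper: run cmd without leaking the build's task ID."""
    return subprocess.run(
        cmd,
        env=env_without_swarming_task_id(environ),
        check=True,
        capture_output=True,
        text=True,
    )
```

A caller would pass the swarming proxy command line as `cmd`. Scrubbing a copy of the environment, rather than deleting the variable from `os.environ`, leaves the value available to anything else in the build that may still rely on it.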
Blocking: 821615
Status: Fixed (was: Started)
Confirmed, it's fixed.

http://cros-goldeneye/chromeos/healthmonitoring/buildDetails?buildbucketId=8952033044618102768
Status: Started (was: Fixed)
Oh... I thought the relevant CL had landed for some reason.

Confirmed that it works. ;>
Blocking: -821615
Project Member

Comment 6 by bugdroid1@chromium.org, Mar 15 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/4cb4f50661ba05e369653216c4f82d13026937ac

commit 4cb4f50661ba05e369653216c4f82d13026937ac
Author: Don Garrett <dgarrett@google.com>
Date: Thu Mar 15 22:27:37 2018

swarming_lib: Remove SWARMING_TASK_ID from cmds.

If SWARMING_TASK_ID is present when running swarming proxy commands,
the proxy will try to make use of it. This fails for swarming builds,
since the swarming build instance and lab swarming proxy instance are
fully independent.

BUG= chromium:821227 
TEST=Unittests + cros tryjob --swarming -g XX samus-release-tryjob

Change-Id: I1cf18ef83b3b9e442ba7d5da63002a6d62fdf258
Reviewed-on: https://chromium-review.googlesource.com/961532
Commit-Ready: Don Garrett <dgarrett@chromium.org>
Tested-by: Don Garrett <dgarrett@chromium.org>
Reviewed-by: Don Garrett <dgarrett@chromium.org>

[add] https://crrev.com/4cb4f50661ba05e369653216c4f82d13026937ac/cbuildbot/swarming_lib_unittest.py
[modify] https://crrev.com/4cb4f50661ba05e369653216c4f82d13026937ac/cbuildbot/swarming_lib.py
[add] https://crrev.com/4cb4f50661ba05e369653216c4f82d13026937ac/cbuildbot/swarming_lib_unittest
[modify] https://crrev.com/4cb4f50661ba05e369653216c4f82d13026937ac/cbuildbot/commands_unittest.py

Status: Fixed (was: Started)
Project Member

Comment 8 by bugdroid1@chromium.org, Mar 16 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/b8e7c2d9a192a0e7c441c6f34284fea6ef68dcd5

commit b8e7c2d9a192a0e7c441c6f34284fea6ef68dcd5
Author: chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com <chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Date: Fri Mar 16 22:50:01 2018

Roll src/third_party/chromite/ 3b75c9d82..3ad8f333d (31 commits)

https://chromium.googlesource.com/chromiumos/chromite.git/+log/3b75c9d82ebf..3ad8f333d567

$ git log 3b75c9d82..3ad8f333d --date=short --no-merges --format='%ad %ae %s'
2018-03-16 dgarrett Revert "Reland "pre_cq_launcher: Swarming for chromeos-infra-puppet-pre-cq.""
2018-03-16 dgarrett Reland "pre_cq_launcher: Swarming for chromeos-infra-puppet-pre-cq."
2018-03-14 ayatane autotest-pre-cq: Remove builder and stage [2/2]
2018-03-16 dgarrett Revert "pre_cq_launcher: Swarming for chromeos-infra-puppet-pre-cq."
2018-03-15 dgarrett chromeos_config: Move fuzzer builds into new bucket.
2018-03-16 dgarrett Revert "commands: RunBranchUtilTest -> RunLocalTryjob"
2018-03-13 dgarrett pre_cq_launcher: Swarming for chromeos-infra-puppet-pre-cq.
2018-02-07 dgarrett commands: RunBranchUtilTest -> RunLocalTryjob
2018-03-14 dgarrett cbuildbot_run: Switch more build links to Legoland.
2018-03-13 dgarrett swarming_lib: Remove SWARMING_TASK_ID from cmds.
2018-03-08 dgarrett moblab_vm_unitest: Fix lint issues.
2018-03-14 ihf chromeos_config: add more arcnext experimental coverage.
2018-03-14 ayatane autotest-pre-cq: Remove this [1/2]
2018-03-14 norvez chromeos_config: remove dead code
2018-03-09 dgarrett summarize_build_stats: Add blank line at beginning.
2018-01-09 dgarrett cros tryjob: Remove buildbot URL generation.
2017-09-14 craigb image_test: Remove check that kernel is not ELF.
2018-03-15 ihf Revert "chromeos_config: temporarily mark eve-arcnext-paladin experimental"
2018-03-15 ihf Revert "chromeos_config: temporarily experimental eve-arcnext-mst-android-pfq"
2018-03-13 lhchavez chromeos_config: Add betty-arcnext builder config
2018-03-13 achuith cbuildbot: Add missing files to index.
2018-03-13 akeshet completion_stages: add a has_important_slave metric to master completion
2018-03-13 dgarrett precq-launcher: Start using Legoland build details page.
2018-03-08 dgarrett chromite-pre-cq: Disable CidbIntegrationTest.
2018-03-14 akeshet chromeos_config: temporarily experimental eve-arcnext-mst-android-pfq
2018-03-13 akeshet chromeos_config: temporarily mark eve-arcnext-paladin experimental
2018-03-12 haddowk [chromite] Make guado_moblab important again
2018-03-13 chrome-bot Update config settings by config-updater.
2018-03-12 gmeinke chromium-config: replace cros_config_host_py in chromite
2018-03-12 yunlian Enable ThinLTO on all AMD64 boards.
2018-03-12 achuith cbuildbot: Log timing of GenerateUploadJSON.

Created with:
  roll-dep src/third_party/chromite
BUG=821930, 822517 , 821615 ,None,821618,821227,None,821664,821930,None,815377,747385,461595,821664,821664,811989,819419,821618,820305,821664,821664,819017,813442,707803,811989


The AutoRoll server is located here: https://chromite-chromium-roll.skia.org

Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, please contact the current sheriff, who should
be CC'd on the roll, and stop the roller if necessary.


TBR=chrome-os-gardeners@chromium.org

Change-Id: Ib6aaddf338307e994865a092ecb322a432148692
Reviewed-on: https://chromium-review.googlesource.com/967273
Commit-Queue: Chromite Chromium Autoroll <chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Reviewed-by: Chromite Chromium Autoroll <chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Cr-Commit-Position: refs/heads/master@{#543855}
[modify] https://crrev.com/b8e7c2d9a192a0e7c441c6f34284fea6ef68dcd5/DEPS

Labels: Merge-Request-66 Merge-Request-65
Cc: josa...@chromium.org
Labels: -Merge-Request-65 -Merge-Request-66 Merge-Approved-66 Merge-Approved-65
Approving merge to 65 and 66.

This should be safe, as it is an infra-only change; if it breaks, we can always revert.
Project Member

Comment 11 by bugdroid1@chromium.org, May 1 2018

Labels: merge-merged-release-R66-10452.B
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/6e00508d133759a56900a30e30b4a05d20d4aba5

commit 6e00508d133759a56900a30e30b4a05d20d4aba5
Author: Don Garrett <dgarrett@google.com>
Date: Tue May 01 22:02:58 2018

swarming_lib: Remove SWARMING_TASK_ID from cmds.

If SWARMING_TASK_ID is present when running swarming proxy commands,
the proxy will try to make use of it. This fails for swarming builds,
since the swarming build instance and lab swarming proxy instance are
fully independent.

BUG= chromium:821227 
TEST=Unittests + cros tryjob --swarming -g XX samus-release-tryjob

Change-Id: I1cf18ef83b3b9e442ba7d5da63002a6d62fdf258
Reviewed-on: https://chromium-review.googlesource.com/961532
Commit-Ready: Don Garrett <dgarrett@chromium.org>
Tested-by: Don Garrett <dgarrett@chromium.org>
Reviewed-by: Don Garrett <dgarrett@chromium.org>
(cherry picked from commit 4cb4f50661ba05e369653216c4f82d13026937ac)
Reviewed-on: https://chromium-review.googlesource.com/1038483

[add] https://crrev.com/6e00508d133759a56900a30e30b4a05d20d4aba5/cbuildbot/swarming_lib_unittest.py
[modify] https://crrev.com/6e00508d133759a56900a30e30b4a05d20d4aba5/cbuildbot/swarming_lib.py
[add] https://crrev.com/6e00508d133759a56900a30e30b4a05d20d4aba5/cbuildbot/swarming_lib_unittest
[modify] https://crrev.com/6e00508d133759a56900a30e30b4a05d20d4aba5/cbuildbot/commands_unittest.py

Project Member

Comment 12 by bugdroid1@chromium.org, May 1 2018

Labels: merge-merged-release-R65-10323.B
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/0faade521f7b9bf4c0bb8ac0481926c0326f29a8

commit 0faade521f7b9bf4c0bb8ac0481926c0326f29a8
Author: Don Garrett <dgarrett@google.com>
Date: Tue May 01 22:03:12 2018

swarming_lib: Remove SWARMING_TASK_ID from cmds.

If SWARMING_TASK_ID is present when running swarming proxy commands,
the proxy will try to make use of it. This fails for swarming builds,
since the swarming build instance and lab swarming proxy instance are
fully independent.

BUG= chromium:821227 
TEST=Unittests + cros tryjob --swarming -g XX samus-release-tryjob

Change-Id: I1cf18ef83b3b9e442ba7d5da63002a6d62fdf258
Reviewed-on: https://chromium-review.googlesource.com/961532
Commit-Ready: Don Garrett <dgarrett@chromium.org>
Tested-by: Don Garrett <dgarrett@chromium.org>
Reviewed-by: Don Garrett <dgarrett@chromium.org>
(cherry picked from commit 4cb4f50661ba05e369653216c4f82d13026937ac)
Reviewed-on: https://chromium-review.googlesource.com/1038484

[add] https://crrev.com/0faade521f7b9bf4c0bb8ac0481926c0326f29a8/cbuildbot/swarming_lib_unittest.py
[modify] https://crrev.com/0faade521f7b9bf4c0bb8ac0481926c0326f29a8/cbuildbot/swarming_lib.py
[add] https://crrev.com/0faade521f7b9bf4c0bb8ac0481926c0326f29a8/cbuildbot/swarming_lib_unittest
[modify] https://crrev.com/0faade521f7b9bf4c0bb8ac0481926c0326f29a8/cbuildbot/commands_unittest.py

Project Member

Comment 13 by sheriffbot@chromium.org, May 7 2018

Cc: bhthompson@google.com
This issue has been approved for a merge. Please merge the fix to any appropriate branches as soon as possible!

If all merges have been completed, please remove any remaining Merge-Approved labels from this issue.

Thanks for your time! To disable nags, add the Disable-Nags label.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Labels: -Merge-Approved-65 -Merge-Approved-66