
Issue 632735

Starred by 4 users

Issue metadata

Status: Fixed
Owner: ----
Closed: Aug 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug




"remote_run_result" is flaky

Project Member Reported by chromium...@appspot.gserviceaccount.com, Jul 29 2016

Issue description

"remote_run_result" is flaky.

This issue was created automatically by the chromium-try-flakes app. Please find the right owner to fix the respective test/step and assign this issue to them. If the step/test is infrastructure-related, please add Infra-Troopers label and change issue status to Untriaged. When done, please remove the issue from Sheriff Bug Queue by removing the Sheriff-Chromium label.

We have detected 4 recent flakes. List of all flakes can be found at https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyHAsSBUZsYWtlIhFyZW1vdGVfcnVuX3Jlc3VsdAw.


 
Cc: mar...@chromium.org
Labels: -Sheriff-Chromium
Owner: phajdan.jr@chromium.org
Status: Assigned (was: Untriaged)
These are things like

@@@STEP_LOG_LINE@exception@Traceback (most recent call last):@@@
@@@STEP_LOG_LINE@exception@  File "../../../scripts/slave/remote_run.py", line 226, in shell_main@@@
@@@STEP_LOG_LINE@exception@    return main(argv)@@@
@@@STEP_LOG_LINE@exception@  File "../../../scripts/slave/remote_run.py", line 211, in main@@@
@@@STEP_LOG_LINE@exception@    with open(recipe_result_path) as f:@@@
@@@STEP_LOG_LINE@exception@IOError: [Errno 2] No such file or directory: 'E:\\b\\rr\\tmpxefl9j\\recipe_result.json'@@@
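
One plausible reading of this IOError is that it is a secondary symptom: the result file was never written, so the open() call fails and hides whatever went wrong earlier. A minimal sketch of a more defensive read, assuming a hypothetical helper named load_recipe_result (this is not the actual code in remote_run.py):

```python
import errno
import json
import os
import tempfile


def load_recipe_result(recipe_result_path):
    """Return the parsed result file, or None if it was never written.

    Hypothetical helper for illustration only.
    """
    try:
        with open(recipe_result_path) as f:
            return json.load(f)
    except (IOError, OSError) as e:
        # Only swallow "file not found"; re-raise anything else.
        if e.errno != errno.ENOENT:
            raise
        return None


# A path that does not exist yields None instead of a traceback.
missing = os.path.join(tempfile.mkdtemp(), 'recipe_result.json')
print(load_recipe_result(missing))
```

With something like this, the caller could report "recipe produced no result" and point at the earlier failure rather than surfacing a bare IOError.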

Possibly related: Issue 627330, which clued me in to try changing the URL to "Uncaught%20Exception/logs/exception"

https://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_x64_rel_ng/builds/252356/steps/Uncaught%20Exception/logs/exception

gives

Traceback (most recent call last):
  File "E:\b\rr\tmpxefl9j\rw\checkout\scripts\slave\.recipe_deps\recipe_engine\recipe_engine\run.py", line 420, in run
    recipe_result = recipe_script.run(api, api._engine.properties)
  File "E:\b\rr\tmpxefl9j\rw\checkout\scripts\slave\.recipe_deps\recipe_engine\recipe_engine\loader.py", line 59, in run
    self.RunSteps, properties, self.PROPERTIES, api=api)
  File "E:\b\rr\tmpxefl9j\rw\checkout\scripts\slave\.recipe_deps\recipe_engine\recipe_engine\loader.py", line 514, in invoke_with_properties
    **additional_args)
  File "E:\b\rr\tmpxefl9j\rw\checkout\scripts\slave\.recipe_deps\recipe_engine\recipe_engine\loader.py", line 475, in _invoke_with_properties
    return callable_obj(*props, **additional_args)
  File "E:\b\rr\tmpxefl9j\rw\checkout\scripts/slave\recipes\chromium_trybot.py", line 242, in RunSteps
    bot_config_object, api, tests, bot_update_step, affected_files)
  File "E:\b\rr\tmpxefl9j\rw\checkout\scripts\slave\.recipe_deps\recipe_engine\recipe_engine\recipe_api.py", line 239, in _inner
    return func(*a, **kw)
  File "E:\b\rr\tmpxefl9j\rw\checkout\scripts/slave\recipe_modules\chromium_tests\api.py", line 688, in run_tests_on_tryserver
    self.m.test_utils.determine_new_failures(api, tests, deapply_patch_fn)
  File "E:\b\rr\tmpxefl9j\rw\checkout\scripts\slave\.recipe_deps\recipe_engine\recipe_engine\recipe_api.py", line 239, in _inner
    return func(*a, **kw)
  File "E:\b\rr\tmpxefl9j\rw\checkout\scripts/slave\recipe_modules\test_utils\api.py", line 150, in determine_new_failures
    failing_tests = self.run_tests_with_patch(caller_api, tests)
  File "E:\b\rr\tmpxefl9j\rw\checkout\scripts\slave\.recipe_deps\recipe_engine\recipe_engine\recipe_api.py", line 239, in _inner
    return func(*a, **kw)
  File "E:\b\rr\tmpxefl9j\rw\checkout\scripts/slave\recipe_modules\test_utils\api.py", line 121, in run_tests_with_patch
    self.run_tests(caller_api, tests, 'with patch')
  File "E:\b\rr\tmpxefl9j\rw\checkout\scripts\slave\.recipe_deps\recipe_engine\recipe_engine\recipe_api.py", line 239, in _inner
    return func(*a, **kw)
  File "E:\b\rr\tmpxefl9j\rw\checkout\scripts/slave\recipe_modules\test_utils\api.py", line 111, in run_tests
    t.post_run(caller_api, suffix, test_filter=test_filters.get(t.name))
  File "E:\b\rr\tmpxefl9j\rw\checkout\scripts/slave\recipe_modules\chromium_tests\steps.py", line 1197, in post_run
    return self._test.post_run(api, suffix, test_filter=test_filter)
  File "E:\b\rr\tmpxefl9j\rw\checkout\scripts/slave\recipe_modules\chromium_tests\steps.py", line 912, in post_run
    api, suffix,test_filter=test_filter)
  File "E:\b\rr\tmpxefl9j\rw\checkout\scripts/slave\recipe_modules\chromium_tests\steps.py", line 814, in post_run
    valid, failures = self.validate_task_results(api, api.step.active_result)
  File "E:\b\rr\tmpxefl9j\rw\checkout\scripts/slave\recipe_modules\chromium_tests\steps.py", line 906, in validate_task_results
    return True, gtest_results.failures
AttributeError: 'GTestResults' object has no attribute 'failures'


That doesn't give me any additional clues about how to route this, though.

Comment 2 by chromium...@appspot.gserviceaccount.com, Aug 1 2016

Detected 3 new flakes for test/step "remote_run_result". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyHAsSBUZsYWtlIhFyZW1vdGVfcnVuX3Jlc3VsdAw. This message was posted automatically by the chromium-try-flakes app.

Comment 3 by chromium...@appspot.gserviceaccount.com, Aug 2 2016

Detected 3 new flakes for test/step "remote_run_result". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyHAsSBUZsYWtlIhFyZW1vdGVfcnVuX3Jlc3VsdAw. This message was posted automatically by the chromium-try-flakes app.

Comment 4 by chromium...@appspot.gserviceaccount.com, Aug 10 2016

Detected 3 new flakes for test/step "remote_run_result". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyHAsSBUZsYWtlIhFyZW1vdGVfcnVuX3Jlc3VsdAw. This message was posted automatically by the chromium-try-flakes app.

Comment 5 by chromium...@appspot.gserviceaccount.com, Aug 16 2016

Detected 3 new flakes for test/step "remote_run_result". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyHAsSBUZsYWtlIhFyZW1vdGVfcnVuX3Jlc3VsdAw. This message was posted automatically by the chromium-try-flakes app.
Cc: nicholaslin@google.com jbudorick@chromium.org
Components: Infra
Adding jbudorick@ and nicholaslin@ in case they can help route this bug to the right person. Per #c1 this might be related to .../recipe_modules/chromium_tests/steps.py (*), and they have recently touched this file (though probably before this problem started).

(*) https://chromium.googlesource.com/chromium/tools/build/+log/master/scripts/slave/recipe_modules/chromium_tests/steps.py
Labels: -Pri-1 Pri-0
Looking at https://build.chromium.org/p/chromium.fyi/waterfall, it seems that more than 50% of the fyi bots are currently purple because of the "remote_run_result" step. I think this justifies increasing the priority of the bug.
Labels: Infra-Troopers
The failure seems to be coming from the infrastructure side (i.e. it doesn't seem related to Chromium's product or test code), so let me add this bug to the infra trooper queue.
Cc: benhenry@chromium.org
Cc: phajdan.jr@chromium.org
Labels: -Pri-0 Pri-1
Owner: ----
Status: Untriaged (was: Assigned)
Unassigning to actually get this into the trooper queue (maybe?)

I suspect this isn't a P0 based on its impact on chromium.fyi but will let someone on the infra team make that determination.
Cc: -nicholaslin@google.com
(and nicholaslin is no longer here)
Labels: -Pri-1 Pri-0
oops, meant to leave at P0 pending infra team review.
Cc: -mar...@chromium.org kbr@chromium.org stip@chromium.org dpranke@chromium.org mikec...@chromium.org
I never touched this file (probably tapted@ hadn't realized) and have no clue how this works.

$ git log --format="%ae" -- steps.py | sort | uniq -c | sort -n -r | head -n 20
     31 jbudorick@chromium.org
     20 mikecase@chromium.org
     15 phajdan.jr@chromium.org
     14 kbr@chromium.org
     13 stip@chromium.org
     11 dpranke@chromium.org
      7 agrieve@chromium.org
      5 sergiyb@chromium.org
      4 yolandyan@chromium.org
      3 rnephew@chromium.org
      3 nicholaslin@google.com
      3 nednguyen@google.com
      3 estaab@chromium.org
      3 dtu@chromium.org
      3 bpastene@chromium.org
      2 prasadv@chromium.org
      2 newt@chromium.org
      2 msw@chromium.org
      2 luqui@chromium.org
      2 fdoray@chromium.org


Labels: -Pri-0 Pri-2
#13: huh, that's surprising.

In any event, after looking a bit more, the issue raised in #6 appears to be issue 639891.

I can't tell what was going on originally -- the logs have evaporated -- but GTestResults has a failures object (https://codesearch.chromium.org/chromium/build/scripts/slave/recipe_modules/test_utils/util.py?rcl=0&l=106) unless jsonish is unspecified...
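
To illustrate how an object can end up without an attribute it "has", here is a minimal sketch. It assumes a constructor that only assigns failures when it is given data. The class name GTestResultsSketch and the shape of the input dict are hypothetical; this is not the real GTestResults from test_utils/util.py.

```python
class GTestResultsSketch(object):
    """Hypothetical stand-in used only to show the failure mode."""

    def __init__(self, jsonish=None):
        self.raw = jsonish or {}
        if not jsonish:
            # Early return: 'failures' is never assigned on this path.
            return
        # Made-up input shape: {'tests': {test_name: passed_bool}}.
        self.failures = sorted(
            name for name, passed in jsonish.get('tests', {}).items()
            if not passed)


ok = GTestResultsSketch({'tests': {'Foo.Bar': True, 'Foo.Baz': False}})
print(ok.failures)

empty = GTestResultsSketch()
try:
    empty.failures
except AttributeError as e:
    # Same kind of error as in the traceback from the first comment.
    print(e)

# A defensive caller can treat a missing attribute as "results invalid".
print(getattr(empty, 'failures', None))
```

If the real class behaves like this, a caller such as validate_task_results would need to check for the attribute (or for a validity flag) before returning it.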

Dropping priority as the chromium.fyi bots appear to be fixed.

Comment 15 by chromium...@appspot.gserviceaccount.com, Aug 23 2016

Labels: Sheriff-Chromium
Detected 6 new flakes for test/step "remote_run_result". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyHAsSBUZsYWtlIhFyZW1vdGVfcnVuX3Jlc3VsdAw. This message was posted automatically by the chromium-try-flakes app. Since flakiness is ongoing, the issue was moved back into Sheriff Bug Queue (unless already there).
Hmmm... we are back at 20 remote_run_result-related failures on https://build.chromium.org/p/chromium.fyi/waterfall :-(
Looking at the failure reason step, I'm seeing multiple instances of "RecipeApi has no dependency 'X'", where X includes at least "python", "path", and "isolate". Seems like a broken recipe commit of some kind?

Comment 19 by mmoss@chromium.org, Aug 25 2016

Status: Fixed (was: Untriaged)
Looking at some of the failures, I'm not sure how "flaky" this step really is. There are several failures, sure, but they don't seem to be random. For example, the 20+ failures yesterday all happened within about two minutes of each other and all look like the same fetch issue, which may indicate a momentary network or gerrit hiccup.
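
One rough way to tell a burst from scattered flakes is to group failure times by the gap between neighbours. The sketch below is only an illustration: the function name group_into_bursts, the two-minute gap, and the sample timestamps are all made up and are not taken from the flakes app.

```python
from datetime import datetime, timedelta


def group_into_bursts(timestamps, max_gap=timedelta(minutes=2)):
    """Group timestamps so that neighbours at most max_gap apart share a burst.

    This is gap-based, so a long chain of closely spaced failures can span
    more than max_gap in total.
    """
    bursts = []
    for ts in sorted(timestamps):
        if bursts and ts - bursts[-1][-1] <= max_gap:
            bursts[-1].append(ts)
        else:
            bursts.append([ts])
    return bursts


# Made-up sample data: four failures close together, one much later.
base = datetime(2016, 8, 24, 10, 0, 0)
failures = [base + timedelta(seconds=s) for s in (0, 15, 40, 95)]
failures.append(base + timedelta(hours=5))

bursts = group_into_bursts(failures)
print([len(b) for b in bursts])
```

A few large bursts suggest a shared outage (network, gerrit, a bad commit), while many single-failure groups look more like genuine flakiness.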

At the very least, I don't think all these failures should be conflated in one bug, which is just confusing and obviously not helping to get anything resolved. I'm going to close this as having fixed the original issue (from a month ago) and set the flakes app bug to 0, so it will start fresh if there are new failures.
