
Issue 732584

Issue metadata

Status: WontFix
Owner:
Closed: Aug 2017
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug




"telemetry.internal.platform.tracing_agent.cpu_tracing_agent_unittest.CpuTracingAgentTest.testCollectAgentTraceDataBeforeStop" is flaky

Project Member Reported by chromium...@appspot.gserviceaccount.com, Jun 12 2017

Issue description

"telemetry.internal.platform.tracing_agent.cpu_tracing_agent_unittest.CpuTracingAgentTest.testCollectAgentTraceDataBeforeStop" is flaky.

This issue was created automatically by the chromium-try-flakes app. Please find the right owner to fix the respective test/step and assign this issue to them. If the step/test is infrastructure-related, please add Infra-Troopers label and change issue status to Untriaged. When done, please remove the issue from Sheriff Bug Queue by removing the Sheriff-Chromium label.

We have detected 3 recent flakes. List of all flakes can be found at https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyhwELEgVGbGFrZSJ8dGVsZW1ldHJ5LmludGVybmFsLnBsYXRmb3JtLnRyYWNpbmdfYWdlbnQuY3B1X3RyYWNpbmdfYWdlbnRfdW5pdHRlc3QuQ3B1VHJhY2luZ0FnZW50VGVzdC50ZXN0Q29sbGVjdEFnZW50VHJhY2VEYXRhQmVmb3JlU3RvcAw.

Flaky tests should be disabled within 30 minutes unless culprit CL is found and reverted. Please see more details here: https://sites.google.com/a/chromium.org/dev/developers/tree-sheriffs/sheriffing-bug-queues#triaging-auto-filed-flakiness-bugs
 
Labels: -Sheriff-Chromium
Owner: charliea@chromium.org
Status: Assigned (was: Untriaged)
charliea@: can you take a look?
Happy to: thanks for letting me know about it.
Interesting: it seems like all of the errors look like:

[217/558] telemetry.internal.platform.tracing_agent.cpu_tracing_agent_unittest.CpuTracingAgentTest.testWindowsCanHandleProcessesWithSpaces failed unexpectedly 1.5790s:
  Traceback (most recent call last):
    File "e:\b\swarm_slave\w\ir\third_party\catapult\telemetry\telemetry\internal\platform\tracing_agent\cpu_tracing_agent_unittest.py", line 153, in testWindowsCanHandleProcessesWithSpaces
      proc_collector.Init()
    File "e:\b\swarm_slave\w\ir\third_party\catapult\telemetry\telemetry\internal\platform\tracing_agent\cpu_tracing_agent.py", line 115, in Init
      self._GetProcessesAsStrings()
    File "e:\b\swarm_slave\w\ir\third_party\catapult\telemetry\telemetry\internal\platform\tracing_agent\cpu_tracing_agent.py", line 141, in _GetProcessesAsStrings
      self._GET_PERF_DATA_SHELL_COMMAND).strip().split('\n')[2:]
    File "e:\b\depot_tools\python276_bin\lib\subprocess.py", line 573, in check_output
      raise CalledProcessError(retcode, cmd, output=output)
  CalledProcessError: Command '['wmic', 'path', 'Win32_PerfFormattedData_PerfProc_Process', 'get', 'CreatingProcessID,IDProcess,Name,PercentProcessorTime,WorkingSet']' returned non-zero exit status -1073738817

It's really strange that the wmic query would return with a non-zero exit status: I've never seen that.

This page (https://msdn.microsoft.com/en-us/library/aa394574(v=vs.85).aspx) says that "If an error occurs, WMI returns one of the following error codes as a 32-bit value where the two high-order bits indicate the severity code of the message." In our case, the exit status is -1073738817, which, read as an unsigned 32-bit value, is 11000000 00000000 00001011 10111111 in binary, or 0xC0000BBF in hex. The two high-order bits are 11, which according to the chart on that page means that there's an error. If we look 0xC0000BBF up in the WMI error constants file we get... nothing. There are no errors that start with 0xC0000, even though the severity chart told us that *all* errors have 11 in the two highest-order bits. That chart is also in direct conflict with every error code listed on the WMI error codes page: those all have 10 in the highest-order bits.
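For anyone who wants to reproduce the arithmetic above, here is a minimal sketch of the conversion. The helper name decode_exit_status is made up for illustration and is not part of telemetry. The value 0x80041001 is included only as an example of a code from the WMI error list (WBEM_E_FAILED, as far as I know), to show the 10 versus 11 difference in the top two bits.

```python
def decode_exit_status(status):
    """Reinterpret a (possibly negative) 32-bit exit status as unsigned.

    Hypothetical helper for illustration only. Returns the hex form, the
    binary form grouped into bytes, and the two high-order bits.
    """
    unsigned = status & 0xFFFFFFFF  # two's complement -> unsigned 32-bit
    bits = format(unsigned, '032b')
    return {
        'hex': '0x%08X' % unsigned,
        'binary': ' '.join(bits[i:i + 8] for i in range(0, 32, 8)),
        'severity_bits': bits[:2],
    }


wmic_status = decode_exit_status(-1073738817)
print(wmic_status['hex'])            # 0xC0000BBF
print(wmic_status['binary'])         # 11000000 00000000 00001011 10111111
print(wmic_status['severity_bits'])  # 11

# Example code from the WMI error list, for comparison.
wmi_listed = decode_exit_status(0x80041001)
print(wmi_listed['severity_bits'])   # 10
```

The mask with 0xFFFFFFFF is what turns the negative number Python reports into the 32-bit pattern that the documentation talks about.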


nednguyen@ made the good suggestion that we log the output of the wmic command when it fails, rather than relying on the return code alone. I have a CL out to do just that, and will check back in on this bug in two days, by which point we'll hopefully have seen a few more flakes, this time with logging information.
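Roughly, the idea is the pattern below. This is only a sketch of the general technique and not the actual CL: the function name check_output_logging_failure is hypothetical, and the demo runs a small Python child process instead of wmic so that it works on any platform.

```python
import logging
import subprocess
import sys


def check_output_logging_failure(command):
    """Run command and return its output.

    Hypothetical wrapper: on a non-zero exit status, log whatever the
    command printed before re-raising, so the failure is debuggable.
    """
    try:
        return subprocess.check_output(command, stderr=subprocess.STDOUT)
    except subprocess.CalledProcessError as e:
        logging.error('Command %r failed with exit status %d. Output:\n%s',
                      command, e.returncode, e.output)
        raise


if __name__ == '__main__':
    # Stand-in for the wmic query: a child process that prints and fails.
    failing = [sys.executable, '-c',
               'import sys; print("boom"); sys.exit(3)']
    try:
        check_output_logging_failure(failing)
    except subprocess.CalledProcessError:
        pass  # The captured output has already been logged above.
```

Re-raising keeps the test failing as before; the only change is that the log now carries the command's own output alongside the return code.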

Unfortunately, this command seems to work the vast majority of the time, so debugging locally isn't really an option.
Project Member

Comment 5 by bugdroid1@chromium.org, Jun 13 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/2459df3e0a30326051943b468419a5465835924b

commit 2459df3e0a30326051943b468419a5465835924b
Author: catapult-deps-roller@chromium.org <catapult-deps-roller@chromium.org>
Date: Tue Jun 13 15:46:10 2017

Roll src/third_party/catapult/ 272523541..deb2d7670 (1 commit)

https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/27252354126c..deb2d7670a64

$ git log 272523541..deb2d7670 --date=short --no-merges --format='%ad %ae %s'
2017-06-13 charliea Log wmic output when the wmic command throws a CalledProcessError

Created with:
  roll-dep src/third_party/catapult
BUG= 732584 


Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, see:
http://www.chromium.org/developers/tree-sheriffs/sheriff-details-chromium#TOC-Failures-due-to-DEPS-rolls


CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.android:android_optional_gpu_tests_rel
TBR=sullivan@chromium.org

Change-Id: I530fe969eaaf9b3d69016a2002d4afeb23169823
Reviewed-on: https://chromium-review.googlesource.com/533253
Reviewed-by: <catapult-deps-roller@chromium.org>
Commit-Queue: <catapult-deps-roller@chromium.org>
Cr-Commit-Position: refs/heads/master@{#479031}
[modify] https://crrev.com/2459df3e0a30326051943b468419a5465835924b/DEPS

Status: WontFix (was: Assigned)
It looks like these errors haven't happened in over a month: https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyhwELEgVGbGFrZSJ8dGVsZW1ldHJ5LmludGVybmFsLnBsYXRmb3JtLnRyYWNpbmdfYWdlbnQuY3B1X3RyYWNpbmdfYWdlbnRfdW5pdHRlc3QuQ3B1VHJhY2luZ0FnZW50VGVzdC50ZXN0Q29sbGVjdEFnZW50VHJhY2VEYXRhQmVmb3JlU3RvcAw

I have no idea why they stopped: it was nothing that I did. However, I'm going to close this as WontFix (obsolete) and leave it closed unless they begin to happen again.
