
Issue 876358

Starred by 1 user

Issue metadata

Status: WontFix
Owner:
Closed: Aug 30
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug




PFQ failing for BVT test failure with peach-pit

Project Member Reported by jen...@chromium.org, Aug 21

Issue description

The peach_pit PFQ failed to run the BVT tests. See the error log at:
https://luci-logdog.appspot.com/logs/chromeos/buildbucket/cr-buildbucket.appspot.com/8937610431291682592/+/steps/HWTest__bvt-inline_/0/stdout

The following error repeated many times in the error log:
Triggered task: peach_pit-chrome-pfq/R70-10986.0.0-rc1-bvt-inline
chromeos-golo-server2-232: 3f74c379a5643f10 1
  Traceback (most recent call last):
    File "/usr/local/autotest/site_utils/run_suite.py", line 2078, in <module>
      sys.exit(main())
    File "/usr/local/autotest/site_utils/run_suite.py", line 2067, in main
      result = _run_task(options)
    File "/usr/local/autotest/site_utils/run_suite.py", line 2002, in _run_task
      return _run_suite(options)
    File "/usr/local/autotest/site_utils/run_suite.py", line 1773, in _run_suite
      return _handle_job_wait(afe, job_id, options, job_timer, is_real_time)
    File "/usr/local/autotest/site_utils/run_suite.py", line 1826, in _handle_job_wait
      _poke_buildbot_with_output(afe, job_id, job_timer)
    File "/usr/local/autotest/site_utils/run_suite.py", line 1985, in _poke_buildbot_with_output
      rpc_helper.diagnose_job(job_id, afe.server)
    File "/usr/local/autotest/site_utils/diagnosis_utils.py", line 345, in diagnose_job
      hostqueueentry__complete=False)
    File "/usr/local/autotest/server/frontend.py", line 565, in get_jobs
      jobs_data = self.run('get_jobs_summary', **dargs)
    File "/usr/local/autotest/server/cros/dynamic_suite/frontend_wrappers.py", line 131, in run
      self, call, **dargs)
    File "/usr/local/autotest/site-packages/chromite/lib/retry_util.py", line 249, in GenericRetry
      return _run()
    File "/usr/local/autotest/site-packages/chromite/lib/retry_util.py", line 182, in _Wrapper
      ret = func(*args, **kwargs)
    File "/usr/local/autotest/site-packages/chromite/lib/retry_util.py", line 248, in _run
      return functor(*args, **kwargs)
    File "/usr/local/autotest/server/cros/dynamic_suite/frontend_wrappers.py", line 94, in _run
      return super(RetryingAFE, self).run(call, **dargs)
    File "/usr/local/autotest/server/frontend.py", line 108, in run
      result = utils.strip_unicode(rpc_call(**dargs))
    File "/usr/local/autotest/frontend/afe/json_rpc/proxy.py", line 143, in __call__
      raise BuildException(resp['error'])
  autotest_lib.frontend.afe.json_rpc.proxy.JSONRPCException: OperationalError: (2003, "Can't connect to MySQL server on '35.193.83.139' (110)")
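
The traceback passes through RetryingAFE in frontend_wrappers.py and GenericRetry in chromite's retry_util.py, which suggests the failing RPC is retried, and that would be consistent with the same error showing up repeatedly in the log. Below is a minimal sketch of that general retry pattern. It is not chromite's implementation: `retry_call` and all of its parameters are hypothetical names used only for illustration.

```python
import time


def retry_call(func, max_attempts=3, delay_seconds=0.0,
               retry_on=(Exception,), sleep=time.sleep):
    """Call func() until it succeeds or max_attempts is exhausted.

    Hypothetical helper: logs each failure, waits delay_seconds between
    attempts, and re-raises the last exception if every attempt fails.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on as error:
            # Each failed attempt is logged, so a persistent outage
            # produces one log entry per attempt.
            print("attempt %d failed: %s" % (attempt, error))
            if attempt == max_attempts:
                raise
            sleep(delay_seconds)


if __name__ == "__main__":
    calls = {"count": 0}

    def flaky_rpc():
        # Stand-in for an RPC that fails twice and then succeeds.
        calls["count"] += 1
        if calls["count"] < 3:
            raise IOError("can't connect")
        return "ok"

    print(retry_call(flaky_rpc, max_attempts=5))
```

With a wrapper like this, a database that stays unreachable for the whole retry window yields one logged failure per attempt before the final exception propagates.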
 
Labels: -Pri-3 Pri-2
Owner: gu...@chromium.org
Status: Assigned (was: Untriaged)
The most recent run is green, so this should be a one-time issue.
https://cros-goldeneye.corp.google.com/chromeos/healthmonitoring/buildDetails?buildbucketId=8937558611333144352

But the subsequent run failed with a timeout while running autotests.
https://luci-logdog.appspot.com/logs/chromeos/buildbucket/cr-buildbucket.appspot.com/8937534644086911312/+/steps/HWTest__bvt-inline_/0/stdout

23:50:24: ERROR: Timeout occurred- waited 20204 seconds, failing. Timeout reason: This build has reached the timeout deadline set by the master. Either this stage or a previous one took too long (see stage timing historical summary in ReportStage) or the build failed to start on time.

This could be another flake, but I wonder whether something is wrong in the bot's infra that makes it prone to flakiness?

We had many issues in the lab this week. A typical problem was jobs timing out. I see it has been much greener lately, so let's wait until next Monday and check again.
Owner: jrbarnette@chromium.org
Status: Unconfirmed (was: Assigned)
-> current deputy
Status: WontFix (was: Unconfirmed)
>   autotest_lib.frontend.afe.json_rpc.proxy.JSONRPCException: OperationalError: (2003, "Can't connect to MySQL server on '35.193.83.139' (110)")

This error indicates that the TKO database server was rejecting
connections.  I don't know why/how that happened, but if such events
were widespread, we'd know.

Moreover, the peach_pit-pfq has been green for several builds, so any
failures from last week would appear to be obsolete.
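
For anyone triaging a similar log line: 2003 is the MySQL client's "can't connect to server" error, and the trailing number in parentheses is the OS-level errno. On Linux, 110 is ETIMEDOUT, which suggests the connection attempt timed out instead of being actively refused (that would be 111). A minimal sketch for pulling those fields out of the message follows. The function names and the small errno table are illustrative assumptions, not part of autotest, and the errno numbers are Linux-specific.

```python
import re

# Linux errno values commonly seen alongside MySQL client error 2003.
# Assumption: the client ran on Linux; other platforms use other numbers.
LINUX_ERRNO_NAMES = {
    110: "ETIMEDOUT (connection timed out)",
    111: "ECONNREFUSED (connection refused)",
    113: "EHOSTUNREACH (no route to host)",
}

# Matches the tuple text of a MySQL "can't connect" OperationalError.
_CONNECT_ERROR = re.compile(
    r"\((?P<code>\d+), \"Can't connect to MySQL server on "
    r"'(?P<host>[^']+)' \((?P<os_errno>\d+)\)\"\)"
)


def parse_connect_error(message):
    """Return (mysql_code, host, os_errno), or None if nothing matches."""
    match = _CONNECT_ERROR.search(message)
    if match is None:
        return None
    return int(match["code"]), match["host"], int(match["os_errno"])


def describe_connect_error(message):
    """Build a one-line human-readable summary of the connect failure."""
    parsed = parse_connect_error(message)
    if parsed is None:
        return "unrecognized error"
    code, host, os_errno = parsed
    reason = LINUX_ERRNO_NAMES.get(os_errno, "errno %d" % os_errno)
    return "MySQL client error %d connecting to %s: %s" % (code, host, reason)


if __name__ == "__main__":
    line = ('autotest_lib.frontend.afe.json_rpc.proxy.JSONRPCException: '
            'OperationalError: (2003, "Can\'t connect to MySQL server on '
            '\'35.193.83.139\' (110)")')
    print(describe_connect_error(line))
```

Run against the line quoted above, the sketch reports error 2003, the host 35.193.83.139, and errno 110 mapped to ETIMEDOUT.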
