cyan HWTest failure: mysql is not started on cros-full-0031.mtv.corp.google.com
Issue description

I observe a few canary builders failing at HWTest with "swarming internal error":
veyron_mickey: https://luci-logdog.appspot.com/v/?s=chromeos%2Fbb%2Fchromeos%2Fveyron_mickey-release%2F2180%2F%2B%2Frecipes%2Fsteps%2FHWTest__bvt-arc_%2F0%2Fstdout
snappy: https://luci-logdog.appspot.com/v/?s=chromeos%2Fbb%2Fchromeos%2Fsnappy-release%2F1702%2F%2B%2Frecipes%2Fsteps%2FHWTest__bvt-arc_%2F0%2Fstdout
cyan: https://luci-logdog.appspot.com/v/?s=chromeos%2Fbb%2Fchromeos%2Fcyan-release%2F2163%2F%2B%2Frecipes%2Fsteps%2FHWTest__bvt-arc_%2F0%2Fstdout
,
May 8 2018
1. The veyron_mickey failure is just a test failure: "Reason: Tests were retried." http://cautotest-prod/afe/#tab_id=view_job&object_id=198118981
2. The snappy failure is an aborted suite: "Reason: Suite job failed." http://cautotest-prod/afe/#tab_id=view_job&object_id=198122975
3. The cyan failure is "Reason: No test views found.", which means run_suite.py failed to query the afe or tko db for the test results.

Swarming is mostly working well. I do see that a bot died during one of these tasks, which is probably the "internal error", but the task gets retried automatically on another bot, so there is nothing to worry about there.

TODO: fix 3. After checking, I think the problem is that the afe db cannot be reached. The problematic server is "cros-full-0031.mtv.corp.google.com":

chromeos-test@cros-full-0031:~$ mysql -uroot -pautotest chromeos_autotest_db
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)

Assigning to @deputy to fix it.
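
For anyone triaging a similar ERROR 2002, here is a minimal sketch of a check that tells apart "no socket file at all" from "socket file exists but nothing is accepting". The trailing "(2)" in the mysql message is the OS errno, which on Linux is ENOENT (no such file). The function name check_unix_socket and its return strings are hypothetical, made up for illustration; only the socket path comes from the error above.

```python
import os
import socket
import stat


def check_unix_socket(path="/var/run/mysqld/mysqld.sock", timeout=2.0):
    """Return a short diagnosis string for a local server socket path.

    Hypothetical helper, not part of autotest or mysql tooling.
    """
    if not os.path.exists(path):
        # Corresponds to errno 2 (ENOENT): no server created the socket.
        return "missing"
    if not stat.S_ISSOCK(os.stat(path).st_mode):
        # Something exists at the path, but it is not a socket.
        return "not-a-socket"
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
        return "listening"
    except OSError:
        # Stale socket file, or the server is not accepting connections.
        return "refused"
    finally:
        sock.close()


if __name__ == "__main__":
    print(check_unix_socket())
```

A "missing" result on a host is consistent with no local mysqld running there at all, as opposed to a server that crashed and left a stale socket behind.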
,
May 9 2018
There's no mysql under /etc/init.d/ on the drones, so 'service mysql start' doesn't work on a drone like "cros-full-0031". Is puppet responsible for installing and starting mysql on drones? @ayatane
,
May 9 2018
Drones don't have mysql. A drone should be connecting to the master db. Re #2: sounds like a CloudSQL TKO flake from the recent migration.
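
Since drones talk to a remote db rather than a local socket, the more relevant check from a drone is plain TCP reachability. A rough sketch, assuming the database listens on the default MySQL port 3306; can_reach is a hypothetical helper and the host in the usage line is a placeholder, not the real master db address.

```python
import socket


def can_reach(host, port=3306, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper. This only proves the port accepts connections;
    it does not verify credentials or that the service is MySQL.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


if __name__ == "__main__":
    # "db.example.invalid" is a placeholder host name.
    print(can_reach("db.example.invalid"))
```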
,
May 9 2018
If it's a TKO connection flake, note that we just switched the IP to the primary external IP today.
,
May 9 2018
The TKO migration has been committed into CQ. All the servers should be talking to the TKO primary external IP address now. Closing this bug for now. If there are other infra issues, please feel free to open a separate bug.
,
May 9 2018
The CL which switched the IP: https://chrome-internal-review.googlesource.com/c/chromeos/chromeos-admin/+/620845
Comment 1 by ayatane@chromium.org, May 8 2018
Status: Assigned (was: Untriaged)