devserver "health check" alert firing for devservers across the board
Issue description
Alerts are firing for a wide variety of devservers. Problems:
1) For alerts that report only an IP address, it is too hard to identify which devserver is responsible.
2) The alert links to a dashboard that shows failure rate (per minute) rather than failure fraction, which is what is actually alerted on.
I'm not yet sure whether the alerts are meaningful. Filing this for now for follow-up.
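To illustrate why problem 2 matters, here is a minimal sketch of the two metrics. The function names and all numbers are hypothetical and are not taken from the actual alert or dashboard configuration; the point is only that two devservers can show the same failure rate while having very different failure fractions.

```python
def failure_rate_per_minute(failures, window_minutes):
    """Failures per minute over the window; ignores how many checks ran."""
    return failures / window_minutes


def failure_fraction(failures, total_checks):
    """Share of checks that failed; 0.0 when no checks ran."""
    if total_checks == 0:
        return 0.0
    return failures / total_checks


# Hypothetical numbers: same failure count and window, different check volume.
busy = {"failures": 10, "total_checks": 1000, "window_minutes": 10}
quiet = {"failures": 10, "total_checks": 20, "window_minutes": 10}

for name, s in (("busy", busy), ("quiet", quiet)):
    rate = failure_rate_per_minute(s["failures"], s["window_minutes"])
    frac = failure_fraction(s["failures"], s["total_checks"])
    print(f"{name}: rate={rate}/min fraction={frac}")
```

With these made-up inputs a rate-based dashboard would show both devservers identically, while a fraction-based alert would separate them, so the dashboard cannot be used to confirm what the alert saw.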
Jul 5 2017
Some devservers that are failing:
100.115.99.250
chromeos9-infra-devserver1.cros.corp.google.com
chromeos9-infra-devserver.cros.corp.google.com
chromeos9-infra-devserver2.cros.corp.google.com
100.115.99.252
100.115.99.251
Jul 7 2017
Issue 739904 has been merged into this issue.
Jul 7 2017
Jul 10 2017
I took care of this. We were waiting for a CL to go through over the weekend to remove the old VMware (duplicate) MAC address. The devservers are now online and you should no longer see any faults.

jashur@jashur:~$ ping chromeos9-infra-devserver
PING chromeos9-infra-devserver.cros.corp.google.com (100.115.99.252) 56(84) bytes of data.
64 bytes from 100.115.99.252: icmp_seq=1 ttl=62 time=0.543 ms
64 bytes from 100.115.99.252: icmp_seq=2 ttl=62 time=0.514 ms
64 bytes from 100.115.99.252: icmp_seq=3 ttl=62 time=0.517 ms
jashur@jashur:~$ ping chromeos9-infra-devserver1
PING chromeos9-infra-devserver1.cros.corp.google.com (100.115.99.251) 56(84) bytes of data.
64 bytes from 100.115.99.251: icmp_seq=1 ttl=62 time=0.645 ms
64 bytes from 100.115.99.251: icmp_seq=2 ttl=62 time=0.649 ms
64 bytes from 100.115.99.251: icmp_seq=3 ttl=62 time=0.662 ms
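The ping output above also gives the IP-to-hostname mapping that the alerts lack. As a rough sketch, assuming output in the Linux `ping` header format shown above, the mapping could be extracted like this; `ip_to_host_map` is a hypothetical helper, not part of any existing tooling.

```python
import re

# Matches the header line that Linux ping prints: "PING <host> (<ipv4>) ..."
PING_HEADER = re.compile(r"PING (\S+) \((\d{1,3}(?:\.\d{1,3}){3})\)")


def ip_to_host_map(ping_output):
    """Build {ip: hostname} from one or more ping header lines."""
    return {ip: host for host, ip in PING_HEADER.findall(ping_output)}


# Header lines copied from the ping output in this comment.
sample = (
    "PING chromeos9-infra-devserver.cros.corp.google.com (100.115.99.252) 56(84) bytes of data.\n"
    "64 bytes from 100.115.99.252: icmp_seq=1 ttl=62 time=0.543 ms\n"
    "PING chromeos9-infra-devserver1.cros.corp.google.com (100.115.99.251) 56(84) bytes of data.\n"
    "64 bytes from 100.115.99.251: icmp_seq=1 ttl=62 time=0.645 ms\n"
)

mapping = ip_to_host_map(sample)
for ip, host in sorted(mapping.items()):
    print(ip, host)
```

A lookup table like this would only cover hosts that have been pinged; it does not resolve 100.115.99.250, which does not appear in the output above.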
Jul 10 2017
Comment 1 by akes...@chromium.org, Jul 5 2017