gale & whirlwind DUTs failed due to "Not enough free inodes on /mnt/stateful_partition"
Issue description
gale-paladin:3139 failed
Builders failed on:
- gale-paladin: https://luci-milo.appspot.com/buildbot/chromeos/gale-paladin/3139
History of gale-paladin: 8 failed build(s) in a row; Last 10 builds: 8 failed, 2 pass
Is it a network failure? https://luci-logdog.appspot.com/v/?s=chromeos%2Fbb%2Fchromeos%2Fgale-paladin%2F3139%2F%2B%2Frecipes%2Fsteps%2FHWTest__jetstream_cq_%2F0%2Fstdout
,
Jul 10 2017
xixuan@, could you take a look?
,
Jul 10 2017
,
Jul 10 2017
Is it related to the recent swarming proxy outage (crbug.com/738139)?
,
Jul 10 2017
Another possibility is that a bad change landed between 9725.0.0 and 9726.0.0: https://crosland.corp.google.com/log/9725.0.0..9726.0.0
,
Jul 10 2017
The range contains https://chromium-review.googlesource.com/c/529365 (enable network sandbox for builds). Could it be related?
,
Jul 10 2017
,
Jul 10 2017
It is probably not related to the network sandbox change. That should only impact the build and unittest phases. This error is in the hwtest phase, and we don't run ebuild commands there.
,
Jul 10 2017
I think the failure is because none of the CQ DUTs for gale are healthy: https://chromeos-proxy.appspot.com/task?id=3744eb065fafff10&refresh=10&show_raw=1 It is not related to the swarming proxy.
,
Jul 10 2017
Checked 3 of the 6 failed gale DUTs. They all failed in the same pattern after a new test's provision:
- Reset failed due to "Not enough free inodes on /mnt/stateful_partition". Example: https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/hosts/chromeos4-row9-jetstream-host3/60954194-reset
- Repair failed: repair.rpm, repair.jetstream_repair, repair.au and repair.powerwash all failed. Example: https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/hosts/chromeos4-row9-jetstream-host3/60954202-repair
That leaves two questions: 1) why there are not enough free inodes, and 2) why we can't repair the DUTs.
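For anyone reproducing the reset failure by hand, here is a minimal sketch of a free-inode check, assuming Python is available on the machine being inspected. It is not the lab's actual reset code: the function names and the MIN_FREE_INODES threshold are hypothetical, and the real limit is not stated in this thread.

```python
import os

# Hypothetical threshold. The real limit used by the reset check is not
# given in this thread.
MIN_FREE_INODES = 100000


def free_inodes(path):
    """Return the inodes available to unprivileged users on path's filesystem."""
    return os.statvfs(path).f_favail


def check_free_inodes(path, min_free=MIN_FREE_INODES):
    """Return the free inode count, or raise RuntimeError if below min_free."""
    available = free_inodes(path)
    if available < min_free:
        raise RuntimeError(
            "Not enough free inodes on %s: %d available, %d required"
            % (path, available, min_free))
    return available
```

Calling check_free_inodes("/mnt/stateful_partition") on an affected DUT should raise if the inode count is below whatever threshold is chosen. Comparing the count before and after a provision could help narrow down question 1.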
,
Jul 10 2017
Filed https://b.corp.google.com/issues/63524032 first. Letting sheriff @skau debug why this happens on all gale DUTs.
,
Jul 10 2017
,
Jul 10 2017
I currently suspect that a bad CL got into the CQ and killed all the DUTs. We'll see what happens after the lab recovers them.
,
Jul 10 2017
Gale repairs were failing due to servos being inaccessible. That has been fixed now, b/63506983. I don't know why the inode failure is showing up now.
,
Jul 10 2017
The inode failure might be a bad CL in CQ. I'll keep an eye on it.
,
Jul 24 2017
Pri-0 bugs are critical regressions or serious emergencies, and this bug has not been updated in three days. Could you please provide an update, or adjust the priority to a more appropriate level if applicable? If a fix is in active development, please set the status to Started. Thanks for your time! To disable nags, add the Disable-Nags label. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Aug 8 2017
,
Aug 8 2017
About a month has passed, so I assume it has either been fixed or no longer happens. Marking it as WontFix.
Comment 1 by oka@chromium.org
, Jul 10 2017