mysterious SERVER_JOB aborts
Issue description
Tracking bug to collect occurrences. These look like some kind of timeout, but I don't know what is setting the timeout. They seem to time out after ~45 minutes, but the requested suite timeout seems to be 90.
,
Jun 2 2016
,
Jun 2 2016
dshi's theory: these are test timeouts due to devserver overload. See logs in https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/65302062-chromeos-test/chromeos4-row2-rack3-host9/debug/
,
Jun 2 2016
Attached is the CPU% of the devserver causing that issue. The load around 5/31 10PM is at 100%. I added 2 more devservers to chromeos4 (100.107.x.x subnet). It's behaving much better now. There will be some follow-up work to choose a devserver with fewer gsutil processes running.
,
Jun 2 2016
Seems that dan is handling this; assigning to him.
,
Jun 2 2016
OK, fixing the outage is one thing. Let's also fix the logging somehow. How can we better communicate on the waterfall that this was: a) a test timeout rather than a suite timeout; b) a test timeout due to a devserver RPC timeout, rather than just a generic timeout?
,
Jun 3 2016
I accidentally deleted comment #5 and I have no idea what it used to say.
,
Jun 14 2016
Another mysterious abort. Looks like the test was aborted after 15 minutes, though not sure by what. Per-test timeout?
https://uberchromegw.corp.google.com/i/chromeos/builders/link-paladin/builds/25334
http://cautotest.corp.google.com/afe/#tab_id=view_job&object_id=66712142
Suite job logs:
.. .. ..
06/13 23:11:28.123 INFO | server_job:0128| START 66712286-chromeos-test/chromeos2-row24-rack9-host11/hardware_StorageWearoutDetect hardware_StorageWearoutDetect timestamp=1465884597 localtime=Jun 13 23:09:57
06/13 23:11:28.123 INFO | server_job:0128| GOOD 66712286-chromeos-test/chromeos2-row24-rack9-host11/hardware_StorageWearoutDetect hardware_StorageWearoutDetect timestamp=1465884598 localtime=Jun 13 23:09:58 completed successfully
06/13 23:11:28.123 INFO | server_job:0128| END GOOD 66712286-chromeos-test/chromeos2-row24-rack9-host11/hardware_StorageWearoutDetect hardware_StorageWearoutDetect timestamp=1465884598 localtime=Jun 13 23:09:58
06/13 23:11:28.123 DEBUG| suite:1095| Adding status keyval for hardware_StorageWearoutDetect=66712286-chromeos-test
06/13 23:26:32.311 INFO | server_job:0128| START 66712254-chromeos-test/chromeos2-row24-rack9-host9 link-paladin/R53-8451.0.0-rc2/bvt-cq/logging_UserCrash_SERVER_JOB timestamp=1465884227 localtime=Jun 13 23:03:47
06/13 23:26:32.311 INFO | server_job:0128| ABORT 66712254-chromeos-test/chromeos2-row24-rack9-host9 link-paladin/R53-8451.0.0-rc2/bvt-cq/logging_UserCrash_SERVER_JOB timestamp=1465885474 localtime=Jun 13 23:24:34
06/13 23:26:32.311 INFO | server_job:0128| END ABORT 66712254-chromeos-test/chromeos2-row24-rack9-host9 link-paladin/R53-8451.0.0-rc2/bvt-cq/logging_UserCrash_SERVER_JOB timestamp=1465885474 localtime=Jun 13 23:24:34
.. ..
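One way to check the abort duration is to diff the timestamp= fields rather than the log write times. The sketch below is not part of autotest; the regex and the elapsed_by_job helper are hypothetical and assume every status line has the same layout as the excerpt quoted above. On these lines it puts the aborted job at 1247 seconds (about 20 min 47 s) from START to ABORT, versus the roughly 15 minutes between the two log write times.

```python
import re

# Status lines copied from the suite job log quoted in this comment.
LOG = """\
06/13 23:11:28.123 INFO | server_job:0128| START 66712286-chromeos-test/chromeos2-row24-rack9-host11/hardware_StorageWearoutDetect hardware_StorageWearoutDetect timestamp=1465884597 localtime=Jun 13 23:09:57
06/13 23:11:28.123 INFO | server_job:0128| END GOOD 66712286-chromeos-test/chromeos2-row24-rack9-host11/hardware_StorageWearoutDetect hardware_StorageWearoutDetect timestamp=1465884598 localtime=Jun 13 23:09:58
06/13 23:11:28.123 DEBUG| suite:1095| Adding status keyval for hardware_StorageWearoutDetect=66712286-chromeos-test
06/13 23:26:32.311 INFO | server_job:0128| START 66712254-chromeos-test/chromeos2-row24-rack9-host9 link-paladin/R53-8451.0.0-rc2/bvt-cq/logging_UserCrash_SERVER_JOB timestamp=1465884227 localtime=Jun 13 23:03:47
06/13 23:26:32.311 INFO | server_job:0128| END ABORT 66712254-chromeos-test/chromeos2-row24-rack9-host9 link-paladin/R53-8451.0.0-rc2/bvt-cq/logging_UserCrash_SERVER_JOB timestamp=1465885474 localtime=Jun 13 23:24:34
"""

# Assumed layout: "...| <STATUS> <job id> ... timestamp=<epoch seconds> ..."
STATUS_RE = re.compile(
    r"\|\s+(START|ABORT|GOOD|END \w+)\s+(\S+)\s.*?timestamp=(\d+)"
)


def elapsed_by_job(log_text):
    """Map each job id to (final status, seconds between START and END)."""
    starts = {}
    elapsed = {}
    for line in log_text.splitlines():
        match = STATUS_RE.search(line)
        if not match:
            continue  # e.g. the DEBUG keyval line
        status, job, timestamp = match.group(1), match.group(2), int(match.group(3))
        if status == "START":
            starts[job] = timestamp
        elif status.startswith("END") and job in starts:
            elapsed[job] = (status, timestamp - starts[job])
    return elapsed


if __name__ == "__main__":
    for job, (status, seconds) in elapsed_by_job(LOG).items():
        print(f"{status:9s} {seconds:5d}s  {job}")
```

The two figures measure different things: the timestamp= values record when the job itself started and ended, while the leading log times only show when the suite job wrote the status lines.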
,
Jun 18 2016
,
Jun 21 2016
,
Sep 29 2017
The issue is obsoleted.
Comment 1 by akes...@chromium.org
, Jun 2 2016