
Issue 616848 link

Starred by 1 user

Issue metadata

Status: WontFix
Owner:
Closed: Sep 2017
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug




mysterious SERVER_JOB aborts

Project Member Reported by akes...@chromium.org, Jun 2 2016

Issue description

Tracking bug to collect occurrences. These look like some kind of timeout, but I don't know what is setting the timeout. They seem to time out after ~45 minutes, but the requested suite timeout seems to be 90 minutes.
 
Cc: dshi@chromium.org
dshi's theory: these are test timeouts due to devserver overload. See logs in https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/65302062-chromeos-test/chromeos4-row2-rack3-host9/debug/

Comment 5 Deleted

Comment 6 by dshi@chromium.org, Jun 2 2016

Attached is the CPU% of the devserver causing that issue. The load around 5/31 10PM is at 100%.

I added 2 more devservers to chromeos4 (100.107.x.x subnet). It's behaving much better now. There will be some follow-up work to choose a devserver with fewer gsutil processes running.
Attachment: devload.png (210 KB)
Cc: -dshi@chromium.org
Labels: -current-issue
Owner: dshi@chromium.org
Seems that Dan is handling this; assigning to him.
Ok, the fix to the outage is one thing. Let's also fix the logging somehow.

How can we better communicate to the waterfall that this was:
a) A test timeout rather than suite timeout.
b) A test timeout due to devserver rpc timeout, rather than just generic timeout.
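
One way to picture the distinction asked for above is to attach an explicit reason to the abort status line. This is only a minimal sketch: the `AbortReason` enum, the `format_abort` helper, and the `reason=` field are all hypothetical names made up for illustration and are not part of autotest's actual status log format.

```python
import enum


class AbortReason(enum.Enum):
    """Hypothetical abort categories; not an existing autotest type."""
    SUITE_TIMEOUT = "suite timeout"
    TEST_TIMEOUT = "test timeout"
    DEVSERVER_RPC_TIMEOUT = "test timeout (devserver RPC timeout)"


def format_abort(job_name, reason):
    """Build a tab-separated ABORT line carrying a hypothetical reason= field."""
    return "ABORT\t%s\treason=%s" % (job_name, reason.value)


if __name__ == "__main__":
    # Example job name taken from the suite log format used in this bug.
    print(format_abort("logging_UserCrash_SERVER_JOB",
                       AbortReason.DEVSERVER_RPC_TIMEOUT))
```

With something like this, a reader of the waterfall could tell a devserver RPC timeout apart from a generic per-test or suite timeout without opening the debug logs.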
I accidentally deleted comment #5 and I have no idea what it used to say.
Another mysterious abort. Looks like the test was aborted after 15 minutes, though I'm not sure by what. Per-test timeout?

https://uberchromegw.corp.google.com/i/chromeos/builders/link-paladin/builds/25334
http://cautotest.corp.google.com/afe/#tab_id=view_job&object_id=66712142

Suite job logs:
..
..
..
06/13 23:11:28.123 INFO |        server_job:0128| START	66712286-chromeos-test/chromeos2-row24-rack9-host11/hardware_StorageWearoutDetect	hardware_StorageWearoutDetect	timestamp=1465884597	localtime=Jun 13 23:09:57	
06/13 23:11:28.123 INFO |        server_job:0128| 	GOOD	66712286-chromeos-test/chromeos2-row24-rack9-host11/hardware_StorageWearoutDetect	hardware_StorageWearoutDetect	timestamp=1465884598	localtime=Jun 13 23:09:58	completed successfully
06/13 23:11:28.123 INFO |        server_job:0128| END GOOD	66712286-chromeos-test/chromeos2-row24-rack9-host11/hardware_StorageWearoutDetect	hardware_StorageWearoutDetect	timestamp=1465884598	localtime=Jun 13 23:09:58	
06/13 23:11:28.123 DEBUG|             suite:1095| Adding status keyval for hardware_StorageWearoutDetect=66712286-chromeos-test
06/13 23:26:32.311 INFO |        server_job:0128| START	66712254-chromeos-test/chromeos2-row24-rack9-host9	link-paladin/R53-8451.0.0-rc2/bvt-cq/logging_UserCrash_SERVER_JOB	timestamp=1465884227	localtime=Jun 13 23:03:47	
06/13 23:26:32.311 INFO |        server_job:0128| 	ABORT	66712254-chromeos-test/chromeos2-row24-rack9-host9	link-paladin/R53-8451.0.0-rc2/bvt-cq/logging_UserCrash_SERVER_JOB	timestamp=1465885474	localtime=Jun 13 23:24:34	
06/13 23:26:32.311 INFO |        server_job:0128| END ABORT	66712254-chromeos-test/chromeos2-row24-rack9-host9	link-paladin/R53-8451.0.0-rc2/bvt-cq/logging_UserCrash_SERVER_JOB	timestamp=1465885474	localtime=Jun 13 23:24:34	
..
..
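
To check how long the aborted job actually ran, the `timestamp=` fields in the status lines above can be subtracted directly. The following is a rough sketch that assumes the fields are tab-separated as in the pasted log; the regex and the `elapsed_by_job` helper are illustrative, not an autotest API.

```python
import re

# START and END ABORT lines copied from the suite job log above.
LOG_LINES = [
    "06/13 23:26:32.311 INFO |        server_job:0128| START\t"
    "66712254-chromeos-test/chromeos2-row24-rack9-host9\t"
    "link-paladin/R53-8451.0.0-rc2/bvt-cq/logging_UserCrash_SERVER_JOB\t"
    "timestamp=1465884227\tlocaltime=Jun 13 23:03:47\t",
    "06/13 23:26:32.311 INFO |        server_job:0128| END ABORT\t"
    "66712254-chromeos-test/chromeos2-row24-rack9-host9\t"
    "link-paladin/R53-8451.0.0-rc2/bvt-cq/logging_UserCrash_SERVER_JOB\t"
    "timestamp=1465885474\tlocaltime=Jun 13 23:24:34\t",
]

# Matches "START" or "END <STATUS>", then job dir, test name, and epoch timestamp.
STATUS_RE = re.compile(r"\|\s*(START|END \w+)\t([^\t]+)\t([^\t]+)\ttimestamp=(\d+)")


def elapsed_by_job(lines):
    """Return {job_dir: (end_status, seconds between START and END)}."""
    starts = {}
    elapsed = {}
    for line in lines:
        match = STATUS_RE.search(line)
        if not match:
            continue  # Not a START/END status line.
        status, job_dir, _name, timestamp = match.groups()
        if status == "START":
            starts[job_dir] = int(timestamp)
        elif job_dir in starts:
            elapsed[job_dir] = (status, int(timestamp) - starts[job_dir])
    return elapsed


if __name__ == "__main__":
    for job_dir, (status, seconds) in elapsed_by_job(LOG_LINES).items():
        print("%s %s after %d s (%.1f min)" % (job_dir, status, seconds, seconds / 60.0))
```

For the logging_UserCrash_SERVER_JOB entry, the two recorded timestamps differ by 1247 seconds (roughly 21 minutes), while the suite log lines themselves were written about 15 minutes apart (23:11:28 vs. 23:26:32), so which interval corresponds to the actual timeout is still an open question.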

Labels: dut-health
Labels: -dut-health Hotlist-CrOS-DutHealth

Comment 13 by dshi@chromium.org, Sep 29 2017

Status: WontFix (was: Untriaged)
This issue is obsolete.
