
Issue 872758

Starred by 1 user

Issue metadata

Status: Fixed
Owner: ----
Closed: Aug 11
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux , Windows , Mac
Pri: 0
Type: Bug

Blocked on:
issue 873102




sizes step failing on Google Chrome {Win7, Mac, Linux64} with HTTP {400,500}

Project Member Reported by ellyjo...@chromium.org, Aug 9

Issue description

Sample run: https://ci.chromium.org/buildbot/chromium.chrome/Google%20Chrome%20Win/35197

Log:

Sending result 2 of 2 to dashboard.
Confused: 12 files were deleted from c:\users\chrome~1\appdata\local\temp during the test run
Error uploading chartjson data: Discarding JSON, error:
Traceback (most recent call last):
  File "C:\b\rr\tmprkihr4\rw\checkout\scripts\slave\results_dashboard.py", line 493, in _SendResultsJson
    urllib2.urlopen(req)
  File "C:\b\depot_tools\win_tools-2_7_6_bin\python\bin\lib\urllib2.py", line 127, in urlopen
    return _opener.open(url, data, timeout)
  File "C:\b\depot_tools\win_tools-2_7_6_bin\python\bin\lib\urllib2.py", line 410, in open
    response = meth(req, response)
  File "C:\b\depot_tools\win_tools-2_7_6_bin\python\bin\lib\urllib2.py", line 523, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\b\depot_tools\win_tools-2_7_6_bin\python\bin\lib\urllib2.py", line 448, in error
    return self._call_chain(*args)
  File "C:\b\depot_tools\win_tools-2_7_6_bin\python\bin\lib\urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "C:\b\depot_tools\win_tools-2_7_6_bin\python\bin\lib\urllib2.py", line 531, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 400: Bad Request
step returned non-zero exit code: 87
 
Labels: OS-Linux OS-Mac
Widening this because this step is also failing on Google Chrome Mac and Google Chrome Linux64.
Labels: -Pri-1 Pri-0
These are now failing on all official bots. Raising to Pri-0.
Components: -Infra Speed>Dashboard
Labels: Infra-Troopers
Status: Available (was: Untriaged)
There are a number of issues with the general Chromium tree right now. This looks like a problem with the perf dashboard's server? I'll assign it to them.
Cc: eakuefner@chromium.org benjhayden@chromium.org dtu@chromium.org
Cc: grt@chromium.org
Searching the perf dashboard logs for "status:400" brings up a lot of errors about the revisions being uploaded:
https://pantheon.corp.google.com/logs/viewer?project=chromeperf&minLogLevel=0&expandAll=false&timestamp=2018-08-09T16:23:53.720000000Z&customFacets=&limitCustomFacetWidth=true&dateRangeStart=2018-08-09T15:23:59.245Z&dateRangeEnd=2018-08-09T16:23:59.245Z&interval=PT1H&resource=gae_app%2Fmodule_id%2Fdefault&logName=projects%2Fchromeperf%2Flogs%2Fappengine.googleapis.com%252Frequest_log&scrollTimestamp=2018-08-09T16:23:35.926207000Z&filters=status:400

Invalid ID (revision) 1533831815; compared to previous ID 581887, it was larger or smaller by too much

I think it's related to problems with commit position?

Do these builders re-try failed uploads?
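For illustration: 1533831815 reads as a Unix timestamp (2018-08-09 16:23:35 UTC), while 581887 is in the range of a commit position. Below is a minimal sketch of the kind of sanity check that would reject such a jump. The function name `is_acceptable_revision` and the ratio threshold `max_ratio` are assumptions made for this example, not taken from the dashboard's code.

```python
from datetime import datetime, timezone


def is_acceptable_revision(new_id, previous_id, max_ratio=2.0):
    """Hypothetical check: reject IDs far larger or smaller than the last one.

    max_ratio is an assumed threshold, not the dashboard's real value.
    """
    if previous_id is None or previous_id <= 0:
        # No usable history to compare against, so accept the point.
        return True
    if new_id <= 0:
        return False
    return previous_id / max_ratio <= new_id <= previous_id * max_ratio


# The rejected ID, interpreted as seconds since the Unix epoch.
print(datetime.fromtimestamp(1533831815, tz=timezone.utc).isoformat())

print(is_acceptable_revision(581888, 581887))      # next commit position
print(is_acceptable_revision(1533831815, 581887))  # timestamp used as revision
```

With a check of this shape, the next commit position passes and the timestamp-valued revision is refused, which matches the "larger or smaller by too much" message.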
Cc: sullivan@chromium.org
> I think it's related to problems with commit position?

See bug 872729, where people force-pushed commits without the usual commit position header. Maybe related?
The bug from #8 was deduped into https://bugs.chromium.org/p/chromium/issues/detail?id=872722, which definitely looks related.

There haven't been any new chromeperf deployments recently (latest was the 7th) so it's probably not a change on that side.
https://docs.google.com/document/d/11UwPvlhjK5DLKSOBOpF5fRsT8zx1urEIDLavuAcTotY/edit

TL;DR: it looks like there's a Gerrit plugin failure that allowed CLs to land without going through the CQ.
The perf dashboard looks like it's rejecting the bad revisions intentionally. (Hooray!) When the gerrit plugin is fixed, these benchmarks will go back to providing commit positions instead of unix timestamps, and we don't want the perf dashboard's revisions to go backwards. Any objections to letting the perf dashboard continue to reject the bad revisions?

IIUC, the bots will keep retrying to upload the bad revisions even after the gerrit plugin is fixed, so we need to manually purge the data from the bots. Is that correct? Does anybody know how to do that?
The bots *should* only retry on 5XX (transient) errors and not 4XX (permanent) errors. I'm not sure where the recipe for the official sizes step lives, though, so I can't check!
Judging from the logs, it doesn't look like they retry after receiving a 400 response code. One bad request failure and they stop.

However, on subsequent builds do the bots try to upload any previously attempted files or data that might still be around?
The bots usually have a step that retries previous attempts, but only if the response was a 5XX.
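The retry policy described above (retry on 5XX, give up on 4XX) can be sketched as follows. This is a hedged illustration, not the bots' actual recipe code: `upload_with_retry`, `is_transient`, and the `send` callable are hypothetical names, and `send` stands in for whatever performs the HTTP POST and returns a status code.

```python
def is_transient(status):
    """Treat 5XX responses as transient; everything else is final."""
    return 500 <= status <= 599


def upload_with_retry(send, max_attempts=3):
    """Call send() until it returns a non-5XX status or attempts run out.

    send is a hypothetical zero-argument callable returning an HTTP status.
    Returns the list of statuses seen, in order.
    """
    statuses = []
    for _ in range(max_attempts):
        status = send()
        statuses.append(status)
        if not is_transient(status):
            break  # 2XX succeeded, 4XX is permanent: do not retry either
    return statuses


# A 400 is not retried; a 500 is retried until success or the attempt limit.
print(upload_with_retry(lambda: 400))
flaky = iter([500, 500, 200])
print(upload_with_retry(lambda: next(flaky)))
```

Under this policy a rejected revision (HTTP 400) is attempted once and dropped, so it would not be re-sent on later builds.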
Summary: sizes step failing on Google Chrome {Win7, Mac, Linux64} with HTTP 400 (was: sizes step failing on Google Chrome Win7 with HTTP 400)
Summary: sizes step failing on Google Chrome {Win7, Mac, Linux64} with HTTP {400,500} (was: sizes step failing on Google Chrome {Win7, Mac, Linux64} with HTTP 400)
These 3 bots started failing constantly with HTTP 500, starting from:

https://ci.chromium.org/buildbot/chromium.chrome/Google%20Chrome%20Win/35240

Sending result 1 of 2 to dashboard.
Confused: 10 files were deleted from c:\users\chrome~1\appdata\local\temp during the test run
Error while uploading chartjson data: Traceback (most recent call last):
  File "C:\b\rr\tmpec7c4v\rw\checkout\scripts\slave\results_dashboard.py", line 493, in _SendResultsJson
    urllib2.urlopen(req)
  File "C:\b\depot_tools\win_tools-2_7_6_bin\python\bin\lib\urllib2.py", line 127, in urlopen
    return _opener.open(url, data, timeout)
  File "C:\b\depot_tools\win_tools-2_7_6_bin\python\bin\lib\urllib2.py", line 410, in open
    response = meth(req, response)
  File "C:\b\depot_tools\win_tools-2_7_6_bin\python\bin\lib\urllib2.py", line 523, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\b\depot_tools\win_tools-2_7_6_bin\python\bin\lib\urllib2.py", line 448, in error
    return self._call_chain(*args)
  File "C:\b\depot_tools\win_tools-2_7_6_bin\python\bin\lib\urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "C:\b\depot_tools\win_tools-2_7_6_bin\python\bin\lib\urllib2.py", line 531, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 500: Internal Server Error
step returned non-zero exit code: 87

https://ci.chromium.org/buildbot/chromium.chrome/Google%20Chrome%20Linux%20x64/34654
https://ci.chromium.org/buildbot/chromium.chrome/Google%20Chrome%20Mac/35784
Blockedon: 873102
5XX errors should be resolved on the next run with the fix for 873102.
Status: Fixed (was: Available)
Marking as fixed, since this seems to be better now.
