Debug symbols failing for M71 and M72/ToT
Issue description
Debug symbols failing for a number of M71 boards with the last two M71 builds and recent M72 ToT builds.
M71:
2018-10-24 00:51 11151.11.0 71.0.3578.21
2018-10-23 00:30 11151.10.0 71.0.3578.18
M72: Recent as well
Also note that it's not failing for all boards this time, but for most.
,
Oct 24
As a person going through this process for the first few times, is there an alert that announces that the debug symbol upload process is failing? It seems like (so far) we usually find this out at release time when looking at a dashboard. An alert at a minimum, and ideally automatic bug filing, would be helpful.
,
Oct 24
Over to current oncaller.
,
Oct 24
Is this the problem with the symbols server crashing during symbol upload? Looking for logs of the failure.
Sample build: https://cros-goldeneye.corp.google.com/chromeos/healthmonitoring/buildDetails?buildbucketId=8931791811345421664
DebugInfoCheck: This is only a warning, but seems relevant.
06:46:01: INFO: RunCommand: cros_sdk -- debug_info_test /build/eve/usr/lib/debug
/build/eve/usr/lib/debug/opt/intel/fw_parser.debug failed check: check_exist: check_debug_info
07:07:20: WARNING: Traceback (most recent call last):
  File "/b/swarming/w/ir/cache/cbuild/repository/chromite/cbuildbot/stages/generic_stages.py", line 702, in Run
    self.PerformStage()
  File "/b/swarming/w/ir/cache/cbuild/repository/chromite/cbuildbot/stages/test_stages.py", line 592, in PerformStage
    cros_build_lib.RunCommand(cmd, enter_chroot=True)
  File "/b/swarming/w/ir/cache/cbuild/repository/chromite/lib/cros_build_lib.py", line 647, in RunCommand
    raise RunCommandError(msg, cmd_result)
RunCommandError: return code: 1; command: cros_sdk -- debug_info_test /build/eve/usr/lib/debug
Debug Symbols: The only real failure I see is this:
07:35:04: INFO: Uploading symbol_file: chrome/5215AF1973DA1BBE933AC2F407D00CD80/chrome.sym
08:16:52: WARNING: could not upload: chrome.sym: HTTP 400: Bad Request
Everything else appears to have uploaded normally. I believe that means this is the problem which we have already escalated to the crash server team, but I don't have a link to the relevant bugs.
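For anyone scanning these stage logs by hand, a minimal sketch of pulling out the failed uploads might look like the following. The line pattern is inferred only from the single warning quoted in this comment, so treat it as an assumption; real logs may contain other variants. The function names are hypothetical and not part of chromite.

```python
import re
from collections import Counter

# Terminal color codes sometimes survive in copied logs, with or without ESC.
ANSI_CODE = re.compile(r"\x1b?\[[0-9;]*m")

# Pattern inferred from one quoted warning line; an assumption, not a spec.
UPLOAD_WARNING = re.compile(
    r"WARNING: could not upload: (?P<name>\S+): HTTP (?P<status>\d{3}): (?P<reason>.*)"
)


def find_failed_uploads(log_text):
    """Return (symbol file name, HTTP status, reason) for each failed upload."""
    failures = []
    for raw_line in log_text.splitlines():
        line = ANSI_CODE.sub("", raw_line)
        match = UPLOAD_WARNING.search(line)
        if match:
            failures.append(
                (match.group("name"),
                 int(match.group("status")),
                 match.group("reason").strip())
            )
    return failures


def summarize_by_status(failures):
    """Count failures per HTTP status, e.g. to tell 400s apart from 500s."""
    return Counter(status for _, status, _ in failures)
```

Counting by status code is enough to tell whether a build is seeing 400 responses or 500 responses from the crash server.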
,
Oct 24
Re #2) No alerts. We intentionally treat these failures as warnings only, because the crash server upload process is much too flaky to consider this a real failure. I would love to revamp the upload process to be MUCH faster and more efficient. We upload the symbols tarball to GS in about a minute, but spend about an hour attempting to upload them to the crash server.
,
Oct 24
Hrm, I guess the difference in reports boils down to one where it was 500's before and now it's 400's. I think we should just reopen b/117235960 .
,
Oct 24
I reopened that bug and asked for feedback.
,
Oct 25
b/117235960 is reporting this as resolved, and that is confirmed via M72: 11191.0.0 / 72.0.3589.0. However, we'll need to backfill symbols for 11151.11.0 / 71.0.3578.21 since it's the DEV / Beta Candidate. Let's keep this open until the backfill is complete. Thanks
,
Oct 25
Kevin, the backfill instructions are in the DebugSymbolsUpload stage logs for each of the failed uploads that you are interested in. Can you run those on your workstation?
,
Oct 25
Respectfully, this was an infra issue so hoping infra can resolve. We're quite time pressured on other aspects of the release at the moment... Thanks
,
Oct 25
If this is an infra failure, would this be in the CI Bobby's jurisdiction?
,
Oct 25
We own most build related infra. Debug symbols is a bit on the fuzzy side. I would say that this is something that is scripted at the build level, but not at the release level. Having a script for that would be really helpful for situations like this.
,
Oct 25
vapier@, can you assist with the backfill as you did last time? Critical we get these in place. For beta too when we go to push that.
,
Oct 25
Sample instructions:
08:07:01: NOTICE: upload_symbols --failed-list gs://chromeos-image-archive/expresso-release/R72-11185.0.0/failed_upload_symbols.list gs://chromeos-image-archive/expresso-release/R72-11185.0.0/debug_breakpad.tar.xz
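If the backfill commands have to be collected from many stage logs, a small sketch like this could extract them. It assumes every instruction line has the same shape as the sample above (a NOTICE followed by upload_symbols, --failed-list, and two gs:// paths); the function name is hypothetical.

```python
import re
import shlex

# Shape assumed from the one sample NOTICE line; other builds may differ.
NOTICE_COMMAND = re.compile(
    r"NOTICE:\s*(upload_symbols\s+--failed-list\s+gs://\S+\s+gs://\S+)"
)


def extract_backfill_commands(log_text):
    """Return each backfill command found in the log as an argument list."""
    return [shlex.split(m.group(1)) for m in NOTICE_COMMAND.finditer(log_text)]
```

Each returned list can be inspected before anything is run, which is useful when deciding which boards actually need a backfill.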
,
Oct 25
This can be bulk uploaded with a command like:
gsutil ls gs://chromeos-image-archive/*-release/R71-11151.11.0/debug_breakpad.tar.xz | xargs -n 20 bin/upload_symbols
I would expect that to take somewhere between 8 and 150 hours to run.
,
Oct 25
Note: I'm running that now.
,
Oct 25
Slightly revised command:
gsutil ls gs://chromeos-image-archive/*-release/R71-11151.11.0/debug_breakpad.tar.xz | xargs bin/upload_symbols --dedupe --yes
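For reference, the batching that xargs does in these pipelines can be sketched in Python as below. This only builds the argument lists and runs nothing. The batch size of 20 comes from the earlier command, and combining it with --dedupe --yes is an assumption, as is the idea that bin/upload_symbols accepts several tarball URLs per invocation (the xargs usage implies it). The function names are hypothetical.

```python
def tarball_url(board, version):
    """Build the archive path for one board, following the pattern in this thread."""
    return (
        "gs://chromeos-image-archive/"
        f"{board}-release/{version}/debug_breakpad.tar.xz"
    )


def build_upload_commands(tarball_urls, batch_size=20,
                          extra_flags=("--dedupe", "--yes")):
    """Split the URLs into batches and return one argument list per batch."""
    commands = []
    for start in range(0, len(tarball_urls), batch_size):
        batch = tarball_urls[start:start + batch_size]
        commands.append(["bin/upload_symbols", *extra_flags, *batch])
    return commands
```

Keeping the batches small means a failure partway through only requires re-running one batch rather than the whole list.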
,
Oct 26
Don, should I expect to see this resolve / update in GoldenEye once it completes? https://cros-goldeneye.corp.google.com/chromeos/console/viewRelease?releaseName=M71-DEV-CHROMEOS-7 Thx
,
Oct 26
Nope. That's based only on the success of the upload by the build. BTW: The upload is still running. The current rate looks very, very roughly like about 60 hours total run time.
,
Oct 30
Still backfilling, I assume? Is it progressing, stuck, ...?
,
Oct 30
Still backfilling, but it seems to have finished uploading symbols for Chrome and is now backfilling a bunch of small system binaries.
,
Oct 30
For what it's worth, it's going at roughly one file per 7 seconds.
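As a rough sanity check, the two figures quoted in this thread (about 7 seconds per file, and about 60 hours total) can be related with simple arithmetic. Both numbers are loose estimates taken on different days, so the result is only an order-of-magnitude guide; the function names are hypothetical.

```python
def files_for_duration(hours, seconds_per_file=7):
    """Approximate number of files uploaded in the given time at a steady rate."""
    return int(hours * 3600 / seconds_per_file)


def hours_for_files(file_count, seconds_per_file=7):
    """Approximate hours needed to upload file_count files at a steady rate."""
    return file_count * seconds_per_file / 3600
```

At 7 seconds per file, a 60 hour run corresponds to roughly 30,000 files in total.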
,
Oct 30
Got it, should I expect to see the symbols when I view the RC in GoldenEye? https://cros-goldeneye.corp.google.com/chromeos/console/viewRelease?releaseName=M71-DEV-CHROMEOS-7
,
Oct 30
No, that's based on build results, not based on the symbols actually being uploaded.
,
Oct 30
Thanks; nice to confirm :-)
,
Oct 30
I'm going to remove as a blocker since the issue is resolved and the backfill is progressing.
,
Dec 12
Comment 1 by kbleicher@google.com
, Oct 24