Various R67 branch builders are failing BuildPackages for Chrome with an afdo failure
Issue description
"Could not open profile: Unrecognized sample profile encoding format"
Example log: https://logs.chromium.org/v/?s=chromeos%2Fbb%2Fchromeos_release%2Fauron_yuna-release_release-R67-10575.B%2F11%2F%2B%2Frecipes%2Fsteps%2FBuildPackages__afdo_use_%2F0%2Fstdout
...
chromeos-chrome-67.0.3396.17_rc-r1: FAILED: obj/base/allocator/tcmalloc/spinlock.o
chromeos-chrome-67.0.3396.17_rc-r1: /home/chrome-bot/goma/gomacc x86_64-cros-linux-gnu-clang++ -B/usr/x86_64-pc-linux-gnu/x86_64-cros-linux-gnu/binutils-bin/2.27.0-gold -MMD -MF obj/base/allocator/tcmalloc/spinlock.o.d -DNO_HEAP_CHECK -DV8_DEPRECATION_WARNINGS -DUSE_UDEV -DUSE_AURA=1 -DUSE_NSS_CERTS=1 -DUSE_OZONE=1 -DFULL_SAFE_BROWSING -DSAFE_BROWSING_CSD -DSAFE_BROWSING_DB_LOCAL -DOFFICIAL_BUILD -DGOOGLE_CHROME_BUILD -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -DNO_UNWIND_TABLES -DCR_CLANG_REVISION=\"328716-2\" -DOS_CHROMEOS -DCR_SYSROOT_HASH=85ac8d5e0f6cff99fc323fd3d29cb73e2aa970e2 -DNDEBUG -DNVALGRIND -DDYNAMIC_ANNOTATIONS_ENABLED=0 -DTCMALLOC_DONT_REPLACE_SYSTEM_ALLOC -I../../../../../../../home/chrome-bot/chrome_root/src/base/allocator -I../../../../../../../home/chrome-bot/chrome_root/src/third_party/tcmalloc/chromium/src/base -I../../../../../../../home/chrome-bot/chrome_root/src/third_party/tcmalloc/chromium/src -I../../../../../../../home/chrome-bot/chrome_root/src -Igen -fno-strict-aliasing -fmerge-all-constants -fno-unwind-tables -fno-asynchronous-unwind-tables -fPIC -pipe -pthread -fcolor-diagnostics -no-canonical-prefixes -flto=thin -fwhole-program-vtables -m64 -march=x86-64 -fno-omit-frame-pointer -g2 -gsplit-dwarf -ggnu-pubnames -fvisibility=hidden -Wheader-hygiene -Wstring-conversion -Wtautological-overlap-compare -Wall -Wno-unused-variable -Wno-missing-field-initializers -Wno-unused-parameter -Wno-c++11-narrowing -Wno-covered-switch-default -Wno-unneeded-internal-declaration -Wno-inconsistent-missing-override -Wno-undefined-var-template -Wno-nonportable-include-path -Wno-address-of-packed-member -Wno-unused-lambda-capture -Wno-user-defined-warnings -Wno-enum-compare-switch -Wno-null-pointer-arithmetic -Wno-ignored-pragma-optimize -Wno-reorder -Wno-unused-function -Wno-unused-local-typedefs -Wno-unused-private-field -Wno-sign-compare -Wno-unused-result -O2 -fno-ident -fdata-sections -ffunction-sections -std=gnu++14 -fno-exceptions -fno-rtti --sysroot=../../../../../../../build/auron_yuna -fvisibility-inlines-hidden -pipe -pipe -pipe -march=corei7 -fno-split-dwarf-inlining -fdebug-info-for-profiling -D__google_stl_debug_vector=1 -Wno-unknown-warning-option -stdlib=libc++ -fprofile-sample-use=/build/auron_yuna/tmp/portage/chromeos-base/chromeos-chrome-67.0.3396.17_rc-r1/work/afdo/broadwell_R67-3383.0-1524479866.afdo -Wno-error -c ../../../../../../../home/chrome-bot/chrome_root/src/third_party/tcmalloc/chromium/src/base/spinlock.cc -o obj/base/allocator/tcmalloc/spinlock.o
chromeos-chrome-67.0.3396.17_rc-r1: error: /build/auron_yuna/tmp/portage/chromeos-base/chromeos-chrome-67.0.3396.17_rc-r1/work/afdo/broadwell_R67-3383.0-1524479866.afdo: Could not open profile: Unrecognized sample profile encoding format
chromeos-chrome-67.0.3396.17_rc-r1: 1 error generated.
...
The failures don't seem to follow a particular SoC: for example, yuna failed but paine passed, and they should be identical from Chrome's perspective. We are trying a clobber. Luis, is this in your jurisdiction?
,
Apr 24 2018
The profile is compressed 3 times:
gs://chromeos-prebuilt/afdo-job/cwp/chrome/broadwell/R67-3383.0-1524479866.afdo.xz
$ file R67-3383.0-1524479866.afdo.xz
R67-3383.0-1524479866.afdo.xz: XZ compressed data
$ xz -d R67-3383.0-1524479866.afdo.xz && file R67-3383.0-1524479866.afdo
R67-3383.0-1524479866.afdo: XZ compressed data
$ mv R67-3383.0-1524479866.afdo R67-3383.0-1524479866.afdo.xz && xz -d R67-3383.0-1524479866.afdo.xz && file R67-3383.0-1524479866.afdo
R67-3383.0-1524479866.afdo: XZ compressed data
$ mv R67-3383.0-1524479866.afdo R67-3383.0-1524479866.afdo.xz && xz -d R67-3383.0-1524479866.afdo.xz && file R67-3383.0-1524479866.afdo
R67-3383.0-1524479866.afdo: data
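The manual `file` / `xz -d` loop above can be automated. This is only a minimal sketch in Python, not the tooling anyone on this bug used: `peel_xz_layers` is a hypothetical helper name, and it assumes the innermost payload does not itself begin with the xz magic bytes.

```python
import lzma
from typing import Tuple

XZ_MAGIC = b"\xfd7zXZ\x00"  # six-byte header that starts every .xz stream


def peel_xz_layers(data: bytes, max_layers: int = 10) -> Tuple[bytes, int]:
    """Decompress repeatedly while the payload still looks like an xz stream.

    Returns the innermost payload and the number of layers removed.
    """
    layers = 0
    while data.startswith(XZ_MAGIC) and layers < max_layers:
        data = lzma.decompress(data, format=lzma.FORMAT_XZ)
        layers += 1
    return data, layers


if __name__ == "__main__":
    # Stand-in bytes; a real check would read the downloaded .afdo.xz file.
    payload = b"stand-in for raw AFDO profile bytes"
    blob = payload
    for _ in range(3):
        blob = lzma.compress(blob, format=lzma.FORMAT_XZ)
    raw, layers = peel_xz_layers(blob)
    print(layers, raw == payload)
```

A healthy profile should report exactly one layer; anything higher means the file was compressed more than once.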
,
Apr 24 2018
I think this is my fault. I looked at the file and it is xz compressed three times. I added a change to copy the profile to different folders for an experiment, but it compressed it again on each copy. I will manually copy the good profile to each location to unblock the builders and I will fix my code.
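For illustration only, here is one way a copy step could avoid re-compressing on each copy. The real code is a google3 Go library that is not shown on this bug, so everything below (`ensure_single_xz`, `publish_profile`, the folder layout) is a hypothetical Python sketch of the idea: compress once, before the loop, and write the same bytes everywhere.

```python
import lzma
from pathlib import Path
from typing import Iterable, List

XZ_MAGIC = b"\xfd7zXZ\x00"  # header bytes that start every .xz stream


def ensure_single_xz(data: bytes) -> bytes:
    """Compress raw data once; pass through data that is already an xz stream."""
    if data.startswith(XZ_MAGIC):
        return data
    return lzma.compress(data, format=lzma.FORMAT_XZ)


def publish_profile(profile: bytes, name: str, dest_dirs: Iterable[Path]) -> List[Path]:
    """Write one singly compressed blob to every destination folder."""
    blob = ensure_single_xz(profile)  # compress once, outside the loop
    written = []
    for dest in dest_dirs:
        dest.mkdir(parents=True, exist_ok=True)
        out = dest / name
        out.write_bytes(blob)  # identical bytes in each folder
        written.append(out)
    return written
```

The magic-byte check is a heuristic guard; it would misfire on a raw payload that happens to start with the same six bytes.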
,
Apr 24 2018
First part is done. I copied the "good" R67 and R68 profiles to the uarch subfolders.
,
Apr 24 2018
Second part is fixed in b/78527508. This will impact next week's profiles. Marking this as fixed (work complete), but waiting for confirmation that the builds succeeded after the manual profile copying.
,
Apr 24 2018
Latest Eve build still failed: https://logs.chromium.org/v/?s=chromeos%2Fbb%2Fchromeos_release%2Feve-release_release-R67-10575.B%2F13%2F%2B%2Frecipes%2Fsteps%2FBuildPackages__afdo_use_%2F0%2Fstdout
,
Apr 24 2018
Copying the relevant lines, it looks like it failed to fetch the new profile because the checksum was different. But I don't know why it expects the same checksum. The file is modified. I will let Ting-Yuan handle this one.
chromeos-chrome-67.0.3396.17_rc-r1: 16:39:13: INFO: RunCommand: /mnt/host/source/.cache/common/gsutil_4.30.tar.gz/gsutil/gsutil -o 'Boto:num_retries=10' cp -v -- gs://chromeos-prebuilt/afdo-job/cwp/chrome/airmont/R67-3383.0-1524479866.afdo.xz /var/cache/chromeos-cache/distfiles/target/airmont_R67-3383.0-1524479866.afdo.xz.tmp
chromeos-chrome-67.0.3396.17_rc-r1: !!! Fetched file: airmont_R67-3383.0-1524479866.afdo.xz VERIFY FAILED!
chromeos-chrome-67.0.3396.17_rc-r1: !!! Reason: Filesize does not match recorded size
chromeos-chrome-67.0.3396.17_rc-r1: !!! Got: 3899896
chromeos-chrome-67.0.3396.17_rc-r1: !!! Expected: 3900140
chromeos-chrome-67.0.3396.17_rc-r1: Refetching... File renamed to '/var/cache/chromeos-cache/distfiles/target/airmont_R67-3383.0-1524479866.afdo.xz._checksum_failure_.e5q0le'
chromeos-chrome-67.0.3396.17_rc-r1:
chromeos-chrome-67.0.3396.17_rc-r1: !!! Couldn't download 'airmont_R67-3383.0-1524479866.afdo.xz'. Aborting.
chromeos-chrome-67.0.3396.17_rc-r1: * Fetch failed for 'chromeos-base/chromeos-chrome-67.0.3396.17_rc-r1', Log file:
chromeos-chrome-67.0.3396.17_rc-r1: * '/build/eve/tmp/portage/logs/chromeos-base:chromeos-chrome-67.0.3396.17_rc-r1:20180424-233912.log'
chromeos-chrome-67.0.3396.17_rc-r1: >>> Failed to emerge chromeos-base/chromeos-chrome-67.0.3396.17_rc-r1 for /build/eve/, Log file:
chromeos-chrome-67.0.3396.17_rc-r1: >>> '/build/eve/tmp/portage/logs/chromeos-base:chromeos-chrome-67.0.3396.17_rc-r1:20180424-233912.log'
chromeos-chrome-67.0.3396.17_rc-r1:
chromeos-chrome-67.0.3396.17_rc-r1: * Messages for package chromeos-base/chromeos-chrome-67.0.3396.17_rc-r1 merged to /build/eve/:
chromeos-chrome-67.0.3396.17_rc-r1:
chromeos-chrome-67.0.3396.17_rc-r1: * Fetch failed for 'chromeos-base/chromeos-chrome-67.0.3396.17_rc-r1', Log file:
chromeos-chrome-67.0.3396.17_rc-r1: * '/build/eve/tmp/portage/logs/chromeos-base:chromeos-chrome-67.0.3396.17_rc-r1:20180424-233912.log'
=== Complete: job chromeos-chrome-67.0.3396.17_rc-r1 (0m1.7s) ===
Failed chromeos-base/chromeos-chrome-67.0.3396.17_rc-r1 (in 0m1.7s).
Your build has failed.
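The "VERIFY FAILED" lines above are the fetch step comparing the downloaded file against the size recorded for that file name, so replacing a file in the bucket under the same name will fail even when the new contents are good. The sketch below is not Portage's code; it is a simplified Python stand-in, with the hypothetical name `verify_fetched`, that assumes a recorded size and a recorded SHA256 hex digest are available.

```python
import hashlib
from typing import Optional


def verify_fetched(data: bytes, recorded_size: int, recorded_sha256: str) -> Optional[str]:
    """Return a failure reason, or None when the fetched bytes match the record."""
    # Size is the cheap check, so it comes first (as in the log above).
    if len(data) != recorded_size:
        return (
            "Filesize does not match recorded size "
            f"(got {len(data)}, expected {recorded_size})"
        )
    if hashlib.sha256(data).hexdigest() != recorded_sha256:
        return "SHA256 does not match recorded hash"
    return None


if __name__ == "__main__":
    original = b"bytes that were in the bucket when the record was written"
    replaced = b"different bytes uploaded later under the same name"
    size, digest = len(original), hashlib.sha256(original).hexdigest()
    print(verify_fetched(original, size, digest))
    print(verify_fetched(replaced, size, digest))
```

Under this model there are two ways out: put back the exact original bytes, or upload under a new name and update the recorded values to match.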
,
Apr 25 2018
We are going to chump this as the change in gs breaks ALL r67 and r68 builders. Prepared a CL to update the checksum in-place: https://chromium-review.googlesource.com/c/chromiumos/overlays/chromiumos-overlay/+/1026888 Tryjobs are running: http://cros-goldeneye/chromeos/healthmonitoring/buildDetails?buildbucketId=8948319355358189200 http://cros-goldeneye/chromeos/healthmonitoring/buildDetails?buildbucketId=8948319044513216416
,
Apr 25 2018
,
Apr 25 2018
if you have some process that might clobber files in the bucket, that must be fixed. breaking existing manifests like this is entirely unacceptable. if the files are being uploaded by hand and devs have the ability to crush them, that sounds like a glaring problem. i've turned on versioning in the GS bucket so that at least if it happens again, we still have the files there.
,
Apr 25 2018
TL;DR: It looks like we got super lucky: the Chrome-PFQs updated the profiles for us 5 minutes ago. Full story: In case the in-place fix doesn't work, gmx@ uploaded the same profiles with their names/timestamps bumped. Our backup plan was to manually update the ebuilds and manifest, because it can be several days before the PFQs pick them up. Luckily, that just happened 5 minutes ago. (I found this out because the CLs in #8 conflict with the HEADs, which were updated by the PFQs.) I'll keep an eye on the release builders before marking this verified.
,
Apr 25 2018
[Auto-generated comment by a script] We noticed that this issue is targeted for M-67; it appears the fix may have landed after branch point, meaning a merge might be required. Please confirm if a merge is required here - if so add Merge-Request-67 label, otherwise remove Merge-TBD label. Thanks.
,
Apr 25 2018
if you have the original files, then restore them on the GS bucket, and there's no need to modify the Manifest files. that's _strongly_ preferred to making any changes in the git repos.
,
Apr 25 2018
Sadly, I don't have them. But they were generated as successive xz compressions of the good profile (xz compressed once). The compression is done using a google3 Go library. I can try to recreate the process starting from the good profile and see if I get the same hashes. But it will be tomorrow.
,
Apr 25 2018
if we have to make changes anyways w/backporting of CLs, just add new files and update the ebuilds to point to those
i wonder if we can crawl the bot's distfiles cache to see if they have good copies
,
Apr 25 2018
I fully agree. In hindsight it is all logical. I spend 99% of my life in google3 where I don't deal with Manifests and multiple repos. It is one unified repo under version control. I for one, don't know where the distfiles cache is.
,
Apr 25 2018
Good news. I was able to recreate the bad profiles with matching SHA256 hashes, and I uploaded them back under their old names. I don't know if anybody can validate the manifests for an earlier commit before the chrome uprev.
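The recreation described above amounts to re-applying the extra xz layers to the good profile and comparing SHA256 hashes with the recorded ones. The Python sketch below only illustrates that idea and uses hypothetical helper names. Xz output depends on the encoder and its settings, so Python's `lzma` would not necessarily reproduce the bytes written by the Go library; matching hashes need the same encoder and options as the original run.

```python
import hashlib
import lzma


def add_xz_layers(blob: bytes, extra_layers: int) -> bytes:
    """Wrap blob in extra_layers additional xz streams."""
    for _ in range(extra_layers):
        blob = lzma.compress(blob, format=lzma.FORMAT_XZ)
    return blob


def matches_recorded(blob: bytes, recorded_sha256: str) -> bool:
    """Compare the SHA256 of blob with a recorded hex digest."""
    return hashlib.sha256(blob).hexdigest() == recorded_sha256


if __name__ == "__main__":
    # Stand-in for the good profile, which is xz-compressed once.
    good = lzma.compress(b"stand-in profile bytes", format=lzma.FORMAT_XZ)
    # Pretend this digest was recorded when the triple-compressed file was uploaded.
    recorded = hashlib.sha256(add_xz_layers(good, 2)).hexdigest()
    # Recreate from the good profile and check against the record.
    print(matches_recorded(add_xz_layers(good, 2), recorded))
```

The hash comparison is what makes the re-upload safe: a match means builders pinned to the old file names fetch exactly the bytes their recorded checksums expect.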
,
Apr 25 2018
R67 eve and auron_yuna builds look good. Their chromeos-chrome builds have been running for more than 60 minutes without failure. https://uberchromegw.corp.google.com/i/chromeos_release/builders/auron_yuna-release%20release-R67-10575.B/builds/14 https://uberchromegw.corp.google.com/i/chromeos_release/builders/eve-release%20release-R67-10575.B/builds/14
,
Apr 25 2018
laszio@, did Chrome-PFQs update on R68 as well? R67 looks good, but R68 has lots of red still. The profiles have been bumped on R68 as well, so we could try to manually update the ebuilds there.
,
Apr 25 2018
Yes, Chrome-PFQs also updated the ebuilds on R68. I checked the logs of a few boards, and the failures caused by this bug have all disappeared. The R68 failures happen at a later stage of the Chrome build (installation) than this bug. They are due to another bug: https://bugs.chromium.org/p/chromium/issues/detail?id=836296
,
Apr 25 2018
Thanks again for your help on this last night; I'm updating the labels to show the merge was approved (per #12 and IMs). The builds are looking a lot better this AM....
,
Apr 26 2018
Eve R68 builders are failing like this now.
...
chromeos-chrome-68.0.3405.0_rc-r1: FAILED: obj/base/allocator/tcmalloc/abort.o
chromeos-chrome-68.0.3405.0_rc-r1: /home/chrome-bot/goma/gomacc x86_64-cros-linux-gnu-clang++ -B/usr/x86_64-pc-linux-gnu/x86_64-cros-linux-gnu/binutils-bin/2.27.0-gold -MMD -MF obj/base/allocator/tcmalloc/abort.o.d -DNO_HEAP_CHECK -DV8_DEPRECATION_WARNINGS -DUSE_UDEV -DUSE_AURA=1 -DUSE_NSS_CERTS=1 -DUSE_OZONE=1 -DFULL_SAFE_BROWSING -DSAFE_BROWSING_CSD -DSAFE_BROWSING_DB_LOCAL -DOFFICIAL_BUILD -DGOOGLE_CHROME_BUILD -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -DNO_UNWIND_TABLES -DCR_CLANG_REVISION=\"329921-1\" -DOS_CHROMEOS -DCR_SYSROOT_HASH=4e7db513b0faeea8fb410f70c9909e8736f5c0ab -DNDEBUG -DNVALGRIND -DDYNAMIC_ANNOTATIONS_ENABLED=0 -DTCMALLOC_DONT_REPLACE_SYSTEM_ALLOC -I../../../../../../../home/chrome-bot/chrome_root/src/base/allocator -I../../../../../../../home/chrome-bot/chrome_root/src/third_party/tcmalloc/chromium/src/base -I../../../../../../../home/chrome-bot/chrome_root/src/third_party/tcmalloc/chromium/src -I../../../../../../../home/chrome-bot/chrome_root/src -Igen -fno-strict-aliasing -fmerge-all-constants -fno-unwind-tables -fno-asynchronous-unwind-tables -fPIC -pipe -pthread -fcolor-diagnostics -no-canonical-prefixes -flto=thin -fwhole-program-vtables -m64 -march=x86-64 -fno-omit-frame-pointer -g2 -gsplit-dwarf -ggnu-pubnames -fvisibility=hidden -Wheader-hygiene -Wstring-conversion -Wtautological-overlap-compare -Wall -Wno-unused-variable -Wno-missing-field-initializers -Wno-unused-parameter -Wno-c++11-narrowing -Wno-covered-switch-default -Wno-unneeded-internal-declaration -Wno-inconsistent-missing-override -Wno-undefined-var-template -Wno-nonportable-include-path -Wno-address-of-packed-member -Wno-unused-lambda-capture -Wno-user-defined-warnings -Wno-enum-compare-switch -Wno-null-pointer-arithmetic -Wno-ignored-pragma-optimize -Wno-return-std-move -Wno-reorder -Wno-unused-function -Wno-unused-local-typedefs -Wno-unused-private-field -Wno-sign-compare -Wno-unused-result -O2 -fno-ident -fdata-sections -ffunction-sections -std=gnu++14 -fno-exceptions -fno-rtti --sysroot=../../../../../../../build/eve -fvisibility-inlines-hidden -pipe -pipe -pipe -march=corei7 -fno-split-dwarf-inlining -fdebug-info-for-profiling -D__google_stl_debug_vector=1 -Wno-unknown-warning-option -stdlib=libc++ -fprofile-sample-use=/build/eve/tmp/portage/chromeos-base/chromeos-chrome-68.0.3405.0_rc-r1/work/afdo/broadwell_R68-3383.0-1524480987.afdo -Wno-error -c ../../../../../../../home/chrome-bot/chrome_root/src/third_party/tcmalloc/chromium/src/base/abort.cc -o obj/base/allocator/tcmalloc/abort.o
chromeos-chrome-68.0.3405.0_rc-r1: error: /build/eve/tmp/portage/chromeos-base/chromeos-chrome-68.0.3405.0_rc-r1/work/afdo/broadwell_R68-3383.0-1524480987.afdo: Could not open profile: Unrecognized sample profile encoding format
chromeos-chrome-68.0.3405.0_rc-r1: 1 error generated.
...
https://logs.chromium.org/v/?s=chromeos%2Fbb%2Fchromeos%2Feve-release%2F1560%2F%2B%2Frecipes%2Fsteps%2FBuildPackages__afdo_use_%2F0%2Fstdout
Do we expect this to be fixed on a Chrome uprev if the Chrome PFQ passes? If we think this is really fixed we can close it again, but this does not look like the trace in https://bugs.chromium.org/p/chromium/issues/detail?id=836296
,
Apr 26 2018
Yes, this will be fixed by a Chrome uprev. I've noticed that Chrome has been at "68.0.3405.0" on R68 since before this issue was opened. We bumped the profiles after that time, so picking them up requires either a Chrome uprev or a manual CL.
,
Apr 26 2018
Ok, then we can close this again, there should be some other bug for the PFQ to pass.
,
Apr 30 2018
This issue has been approved for a merge. Please merge the fix to any appropriate branches as soon as possible! If all merges have been completed, please remove any remaining Merge-Approved labels from this issue. Thanks for your time! To disable nags, add the Disable-Nags label. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Apr 30 2018
Comment 1 by llozano@chromium.org, Apr 24 2018
Components: Tools>ChromeOS-Toolchain
Labels: -Pri-2 Pri-1
Owner: laszio@chromium.org
Status: Assigned (was: Untriaged)