Jumbo, ios-simulator, mac dbg builders on waterfall are out of space.
Issue descriptionFAILED: headless_browsertests python "../../build/toolchain/gcc_link_wrapper.py" --output="./headless_browsertests" -- ../../third_party/llvm-build/Release+Asserts/bin/clang++ -Wl,--fatal-warnings -fPIC -Wl,-z,noexecstack -Wl,-z,now -Wl,-z,relro -Wl,-z,defs -Wl,--as-needed -fuse-ld=lld -Wl,--icf=all -Wl,--color-diagnostics -m64 -Werror -Wl,-O2 -Wl,--gc-sections -rdynamic -nostdlib++ --sysroot=../../build/linux/debian_sid_amd64-sysroot -L../../build/linux/debian_sid_amd64-sysroot/usr/local/lib/x86_64-linux-gnu -Wl,-rpath-link=../../build/linux/debian_sid_amd64-sysroot/usr/local/lib/x86_64-linux-gnu -L../../build/linux/debian_sid_amd64-sysroot/lib/x86_64-linux-gnu -Wl,-rpath-link=../../build/linux/debian_sid_amd64-sysroot/lib/x86_64-linux-gnu -L../../build/linux/debian_sid_amd64-sysroot/usr/lib/x86_64-linux-gnu -Wl,-rpath-link=../../build/linux/debian_sid_amd64-sysroot/usr/lib/x86_64-linux-gnu -Wl,-rpath-link=. -Wl,--disable-new-dtags -o "./headless_browsertests" -Wl,--start-group @"./headless_browsertests.rsp" -Wl,--end-group -ldl -lpthread -lrt -lgmodule-2.0 -lgobject-2.0 -lgthread-2.0 -lglib-2.0 -lnss3 -lnssutil3 -lsmime3 -lplds4 -lplc4 -lnspr4 -lresolv -lgio-2.0 -lexpat -luuid -ldbus-1 -lXext -lX11 -lXcomposite -lXrender -lm -lX11-xcb -lxcb -lXcursor -lXdamage -lXfixes -lXi -lXtst -lXss -lXrandr -lasound -lz -lpangocairo-1.0 -lpango-1.0 -lcairo -lpci -latk-1.0 -latk-bridge-2.0 -latspi -lcups ld.lld: error: failed to open ./headless_browsertests: No space left on device clang: error: linker command failed with exit code 1 (use -v to see invocation) [49909/52113] LINK ./media_blink_unittests FAILED: media_blink_unittests python "../../build/toolchain/gcc_link_wrapper.py" --output="./media_blink_unittests" -- ../../third_party/llvm-build/Release+Asserts/bin/clang++ -Wl,--fatal-warnings -fPIC -Wl,-z,noexecstack -Wl,-z,now -Wl,-z,relro -Wl,-z,defs -Wl,--as-needed -fuse-ld=lld -Wl,--icf=all -Wl,--color-diagnostics -m64 -Werror -Wl,-O2 -Wl,--gc-sections 
-rdynamic -nostdlib++ --sysroot=../../build/linux/debian_sid_amd64-sysroot -L../../build/linux/debian_sid_amd64-sysroot/usr/local/lib/x86_64-linux-gnu -Wl,-rpath-link=../../build/linux/debian_sid_amd64-sysroot/usr/local/lib/x86_64-linux-gnu -L../../build/linux/debian_sid_amd64-sysroot/lib/x86_64-linux-gnu -Wl,-rpath-link=../../build/linux/debian_sid_amd64-sysroot/lib/x86_64-linux-gnu -L../../build/linux/debian_sid_amd64-sysroot/usr/lib/x86_64-linux-gnu -Wl,-rpath-link=../../build/linux/debian_sid_amd64-sysroot/usr/lib/x86_64-linux-gnu -Wl,-rpath-link=. -Wl,--disable-new-dtags -o "./media_blink_unittests" -Wl,--start-group @"./media_blink_unittests.rsp" -Wl,--end-group -ldl -lpthread -lrt -lgmodule-2.0 -lgobject-2.0 -lgthread-2.0 -lglib-2.0 -lnss3 -lnssutil3 -lsmime3 -lplds4 -lplc4 -lnspr4 -lexpat -luuid -lX11 -lX11-xcb -lxcb -lXcomposite -lXcursor -lXdamage -lXext -lXfixes -lXi -lXrender -lXtst -lXrandr -lresolv -lgio-2.0 -lpci -lXss -lasound -lm -lz -lpangocairo-1.0 -lpango-1.0 -lcairo -ldbus-1 ld.lld: error: failed to open ./media_blink_unittests: No space left on device clang: error: linker command failed with exit code 1 (use -v to see invocation) [49910/52113] ACTION //extensions/shell/installer/linux:app_shell_unstable_deb(//build/toolchain/linux:clang_x64) FAILED: chromium-app-shell-unstable_72.0.3591.0-1_amd64.deb python ../../build/gn_run_binary.py app_shell_installer/debian/build.sh -a x64 -b . -c unstable -d chromium -o . 
-s ../../build/linux/debian_sid_amd64-sysroot install: error writing '/b/swarming/w/ir/cache/builder/src/out/Release/app-shell-deb-staging-unstable//opt/chromium.org/app-shell-unstable/app_shell': No space left on device install: failed to extend '/b/swarming/w/ir/cache/builder/src/out/Release/app-shell-deb-staging-unstable//opt/chromium.org/app-shell-unstable/app_shell': No space left on device app_shell_installer/debian/build.sh failed with exit code 1 [49911/52113] LINK ./services_unittests FAILED: services_unittests python "../../build/toolchain/gcc_link_wrapper.py" --output="./services_unittests" -- ../../third_party/llvm-build/Release+Asserts/bin/clang++ -Wl,--fatal-warnings -fPIC -Wl,-z,noexecstack -Wl,-z,now -Wl,-z,relro -Wl,-z,defs -Wl,--as-needed -fuse-ld=lld -Wl,--icf=all -Wl,--color-diagnostics -m64 -Werror -Wl,-O2 -Wl,--gc-sections -rdynamic -nostdlib++ --sysroot=../../build/linux/debian_sid_amd64-sysroot -L../../build/linux/debian_sid_amd64-sysroot/usr/local/lib/x86_64-linux-gnu -Wl,-rpath-link=../../build/linux/debian_sid_amd64-sysroot/usr/local/lib/x86_64-linux-gnu -L../../build/linux/debian_sid_amd64-sysroot/lib/x86_64-linux-gnu -Wl,-rpath-link=../../build/linux/debian_sid_amd64-sysroot/lib/x86_64-linux-gnu -L../../build/linux/debian_sid_amd64-sysroot/usr/lib/x86_64-linux-gnu -Wl,-rpath-link=../../build/linux/debian_sid_amd64-sysroot/usr/lib/x86_64-linux-gnu -Wl,-rpath-link=. 
-Wl,--disable-new-dtags -o "./services_unittests" -Wl,--start-group @"./services_unittests.rsp" -Wl,--end-group -ldl -lpthread -lrt -lgmodule-2.0 -lgobject-2.0 -lgthread-2.0 -lglib-2.0 -lnss3 -lnssutil3 -lsmime3 -lplds4 -lplc4 -lnspr4 -lX11 -lX11-xcb -lxcb -lXcomposite -lXcursor -lXdamage -lXext -lXfixes -lXi -lXrender -lXtst -lexpat -luuid -lresolv -lgio-2.0 -lm -lXss -lXrandr -latk-1.0 -latk-bridge-2.0 -lpangocairo-1.0 -lpango-1.0 -lcairo -lpci -lasound -lz -ldbus-1 ld.lld: error: failed to open ./services_unittests: No space left on device clang: error: linker command failed with exit code 1 (use -v to see invocation) ninja: build stopped: subcommand failed. step returned non-zero exit code: 1 https://logs.chromium.org/logs/chromium/buildbucket/cr-buildbucket.appspot.com/8931764273626281984/+/steps/compile/0/stdout
,
Oct 24
Actually this may be more than the Jumbo builder; ios-simulator just reported out of space too.
thon ../../build/toolchain/mac/linker_driver.py xcrun lipo -create -output obj/ios/chrome/test/earl_grey/ios_chrome_smoke_egtests obj/ios/chrome/test/earl_grey/x64/ios_chrome_smoke_egtests ios_clang_x86/obj/ios/chrome/test/earl_grey/x86/ios_chrome_smoke_egtests
fatal error: /b/s/w/ir/cache/xcode_ios_10a254a.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/lipo: can't write to output file: obj/ios/chrome/test/earl_grey/ios_chrome_smoke_egtests.lipo (No space left on device)
Traceback (most recent call last):
  File "../../build/toolchain/mac/linker_driver.py", line 229, in <module>
    Main(sys.argv)
  File "../../build/toolchain/mac/linker_driver.py", line 79, in Main
    subprocess.check_call(compiler_driver_args)
  File "/b/s/w/ir/cipd_bin_packages/lib/python2.7/subprocess.py", line 186, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['xcrun', 'lipo', '-create', '-output', 'obj/ios/chrome/test/earl_grey/ios_chrome_smoke_egtests', 'obj/ios/chrome/test/earl_grey/x64/ios_chrome_smoke_egtests', 'ios_clang_x86/obj/ios/chrome/test/earl_grey/x86/ios_chrome_smoke_egtests']' returned non-zero exit status 1
,
Oct 24
Think this one is oos too: https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Win10%20Tests%20x64%20%28dbg%29/3932
,
Oct 24
=> sergeyberezin, who is looking at this until jbudorick@ gets back to his desk.
,
Oct 24
Will take the bug for now. Looking at the disk usage patterns across the fleet, it appears to be affecting only Mac, and primarily swarming bots in golo for 10.12.5 and 10.13.* versions (which is where we run most tests). The hardest hit seems to be 10.13.3, as the "max" disk space hits 100%: http://shortn/_pknNAce3PN (unlike all the other versions, which still have some room).
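For anyone without access to the dashboard, the "max by OS version" view can be approximated as below. This is only a sketch: the bot names and percentages are made up for illustration, and the tuple layout is an assumption rather than the format of any real monitoring export.

```python
from collections import defaultdict


def max_disk_usage_by_os(samples):
    """Return {os_version: highest used-disk percentage seen on any bot}."""
    worst = defaultdict(float)
    for os_version, _bot, used_pct in samples:
        worst[os_version] = max(worst[os_version], used_pct)
    return dict(worst)


# Hypothetical samples: (os_version, bot_id, percent of disk used).
samples = [
    ("Mac-10.12.5", "bot-a", 91.0),
    ("Mac-10.13.3", "bot-b", 100.0),
    ("Mac-10.13.3", "bot-c", 84.5),
    ("Mac-10.13.6", "bot-d", 95.0),
]

by_os = max_disk_usage_by_os(samples)
# OS versions where at least one bot has no free space left.
saturated = sorted(v for v, pct in by_os.items() if pct >= 100.0)
print(saturated)
```

Taking the maximum rather than the mean is deliberate: a single full bot is enough to fail a build, even when the average for that OS version looks healthy.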
,
Oct 24
The 20 hardest-hit bots: http://shortn/_YNZtcPE2IY Let's see what they serve...
,
Oct 24
BTW, it all seems to have started around 10:30am PDT.
,
Oct 24
ios-simulator: vm138-m9 vm155-m9 vm63-m9 vm143-m9 ... All of the bots at 100% disk usage seem to be ios-simulator.
,
Oct 24
At least Jumbo builder healed itself, it seems: http://shortn/_Cwh9S4un3w (swarming probably cleared some caches once it hit 100% disk usage).
,
Oct 24
https://ci.chromium.org/p/chromium/builders/luci.chromium.try/ios-simulator/121499 is the first ios-simulator build that I see failing with an out-of-disk-space error, triggered at 2018-10-24 10:29 AM (PDT).
,
Oct 24
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Win10%20Tests%20x64%20%28dbg%29/3932 mentioned earlier fails in an isolated swarming task, not on the main machine, so I'd say it's different.
,
Oct 24
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/mac-dbg/1349 is related, triggered 2018-10-24 11:12 AM (PDT). But like Jumbo, it healed itself: http://shortn/_Mcf0xChXfd . So I think we're back to ios-simulator for now, which apparently has less disk space (because these are Mac VMs with only 250GB), and has no room to auto-heal.
,
Oct 24
PS. https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/mac-dbg/1348 (the previous build) started 2018-10-24 10:17 AM (PDT) and worked just fine. So the timing is really within these 13 or so minutes.
,
Oct 24
List of CLs that landed right before 10:30am PDT:
5ff40c0e4cb0 2018-10-24 17:27:17 +0000 [run_web_tests] Check for extra baselines
319faa6f9fc6 2018-10-24 17:21:44 +0000 Roll WebGL 6d2f3f4..0d55c88
a4ae44e5446e 2018-10-24 17:13:23 +0000 gfx:: Convert blit_unittest to the new shared memory API
aec5d4fc49d5 2018-10-24 17:12:48 +0000 Enable QUIC connection migration tests for QUIC v99.
  * Make the QuicPacketCreator use an explicitly passed in QuicRandom
  * Make PATH_CHALLENGE and PATH_RESPONSE frames instigate acks.
41a1f2475c30 2018-10-24 17:07:57 +0000 Update V8 to version 7.2.101.
5f13cb27e8f7 2018-10-24 17:07:13 +0000 [Autofill Assistant] Server Payload is saved between requests.
One of these must be the culprit: something that may affect all Macs, and possibly other platforms, and adds disk usage at compile time.
,
Oct 24
Most suspected:
https://crrev.com/c/1297570 Roll WebGL 6d2f3f4..0d55c88 17:21:44
https://crrev.com/c/1286894 [run_web_tests] Check for extra baselines 17:27:17
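The narrowing to these two CLs can be reproduced with a small time-window filter. This is a sketch under two assumptions: the window runs from the start of the last good mac-dbg build (10:17 AM PDT) to the trigger time of the first failing ios-simulator build (10:29 AM PDT), and PDT is treated as a fixed UTC-7 offset, which holds for this date. The commit hashes and UTC timestamps are copied from the list of CLs earlier in the thread; the helper names are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC-7 offset; an assumption that is valid for 2018-10-24 (PDT).
PDT = timezone(timedelta(hours=-7))

# (hash, commit time in UTC, subject), taken from the CL list in this thread.
COMMITS = [
    ("5ff40c0e4cb0", "2018-10-24 17:27:17", "[run_web_tests] Check for extra baselines"),
    ("319faa6f9fc6", "2018-10-24 17:21:44", "Roll WebGL 6d2f3f4..0d55c88"),
    ("a4ae44e5446e", "2018-10-24 17:13:23", "gfx:: Convert blit_unittest to the new shared memory API"),
    ("aec5d4fc49d5", "2018-10-24 17:12:48", "Enable QUIC connection migration tests for QUIC v99."),
    ("41a1f2475c30", "2018-10-24 17:07:57", "Update V8 to version 7.2.101."),
    ("5f13cb27e8f7", "2018-10-24 17:07:13", "[Autofill Assistant] Server Payload is saved between requests."),
]


def pdt_to_utc(local_str):
    """Parse a 'YYYY-MM-DD HH:MM' PDT wall-clock time into an aware UTC datetime."""
    local = datetime.strptime(local_str, "%Y-%m-%d %H:%M").replace(tzinfo=PDT)
    return local.astimezone(timezone.utc)


def commits_in_window(commits, start_pdt, end_pdt):
    """Return hashes of commits whose UTC timestamp falls inside the PDT window."""
    start, end = pdt_to_utc(start_pdt), pdt_to_utc(end_pdt)
    hits = []
    for sha, stamp, _subject in commits:
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
        if start <= when <= end:
            hits.append(sha)
    return hits


suspects = commits_in_window(COMMITS, "2018-10-24 10:17", "2018-10-24 10:29")
print(suspects)
```

The window is only a heuristic: a build may sync to a revision that landed before it started, and the two endpoints come from different builders, so a CL just outside the window is not fully cleared by this filter.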
,
Oct 24
It appears most bots that failed due to disk space already erased their caches: http://shortn/_YkkyIUMvpV So it's likely not easy to see what exactly is taking up space :(
,
Oct 24
For the record, I doubt that my WebGL conformance roll caused this issue. The number of changes in the roll https://crrev.com/c/1297570 was small.
,
Oct 24
FYI, here's a link to the current *max* disk space across bots by OS and OS version: http://shortn/_nagzJhBPte This will be useful to evaluate if a revert worked.
,
Oct 24
Adding here FTR: it's also possible that something in build.git or other places got added around that time and contributed to the overall disk space. E.g. we install a number of things through gclient runhooks as well as various infra pre-task stages. Things to look for are various SDKs, for example.
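If a bot can be caught before its caches are wiped, a quick per-directory size ranking would show which SDK or cache grew. The following is a minimal sketch using only the Python standard library; `dir_sizes` and `largest` are hypothetical helper names, and the demo runs on a throwaway temporary tree rather than a real bot directory.

```python
import os
import tempfile


def dir_sizes(root):
    """Return {immediate child directory name: total bytes of files beneath it}."""
    sizes = {}
    with os.scandir(root) as entries:
        for entry in entries:
            if not entry.is_dir(follow_symlinks=False):
                continue
            total = 0
            for dirpath, _dirnames, filenames in os.walk(entry.path):
                for name in filenames:
                    path = os.path.join(dirpath, name)
                    try:
                        if not os.path.islink(path):
                            total += os.path.getsize(path)
                    except OSError:
                        pass  # file vanished or is unreadable; skip it
            sizes[entry.name] = total
    return sizes


def largest(root, n=5):
    """Return the n biggest child directories as (name, bytes), largest first."""
    return sorted(dir_sizes(root).items(), key=lambda kv: kv[1], reverse=True)[:n]


# Demo on a temporary tree with two made-up directories.
with tempfile.TemporaryDirectory() as root:
    for name, size in (("sdk", 4096), ("cache", 1024)):
        os.mkdir(os.path.join(root, name))
        with open(os.path.join(root, name, "blob"), "wb") as f:
            f.write(b"\0" * size)
    ranking = largest(root)

print(ranking)
```

On a bot, `largest` would be pointed at the cache root seen in the logs above (for example the directory containing `xcode_ios_10a254a.app`). Note that this sums apparent file sizes, which can differ from the blocks actually allocated on disk.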
,
Oct 24
Win tests issue above was due to https://chromium.googlesource.com/chromium/src/+/1c6f831f14152cf4ca9e23563757d61524487234 which unfortunately triggered a bunch of flaky tests. Otherwise things look like they're recovering?
,
Oct 24
Indeed, I don't see disk usage to be out of the ordinary. Both https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/ios-simulator and https://ci.chromium.org/p/chromium/builders/luci.chromium.try/ios-simulator have recovered. https://ci.chromium.org/p/chromium/builders/luci.chromium.try/ios-simulator/121566 [2018-10-24 11:19 AM (PDT)] is the last CQ build with a clear out-of-disk-space error; and https://ci.chromium.org/p/chromium/builders/luci.chromium.try/ios-simulator/121629 [2018-10-24 12:08 PM (PDT)] may be another one (less clear error). This specific outage is over, and I filed issue 898686 to track the high disk usage problem to reduce the chance of this happening again.
,
Oct 24
Both builds in #25 indeed show a disk space problem on their respective machines, and auto-healing: http://shortn/_bvDFdTj9OG
Comment 1 by dalecur...@chromium.org
, Oct 24