MSAN failures on builders are too difficult to debug/fix
Issue description

A patch I reviewed was reverted due to an MSAN failure (crbug.com/785106), but I'm finding it really difficult to debug. Here are some of the gaps I see:

* msan doesn't run in the CQ
* When there's a failure on the bots, the stack traces don't have source files / lines - perhaps they need to be piped through asan_symbolize
* Configuring a local build requires setting GYP_DEFINES, even though Chromium hasn't used GYP for over a year now
* Configuring a local build requires using out/Release, even though GN is supposed to allow custom out directories with lots of different configurations (for example I use out/asan for my asan build)
* The prebuilt instrumented libraries are only available on Trusty, which is deprecated at Google
* The MSAN documentation says to email earthdok@, who left Google more than 2 years ago
,
Nov 17 2017
,
Nov 17 2017
My team maintains msan upstream (in LLVM) but we don't really support it in Chrome (no headcount for that). inferno@, WDYT?
,
Nov 17 2017
Oliver has plans to fix it in this quarter, so keeping it in his queue.
,
Nov 17 2017
msan is computationally expensive, so I'm not sure how easy it'll be to add to the CQ, especially when we're already capacity constrained on linux.

We do need to figure out how to make the stack traces useful.

I *think* that the GYP_DEFINES shouldn't be required at this point. I know I've fixed at least some of the places where this was needed, but I'm not sure if I've fixed all of them (i.e., we might still be missing things).

I'd be surprised if you needed out/Release; when I was testing the failure, I didn't, but of course it was a pretty simple failure to test.

dmazzoni, maybe you can run some of the errors you hit by me so I can double-check?

kcc@, my understanding was that your team was on the hook to maintain all of the sanitizers in Chrome, and I'm surprised by the "no headcount" statement. It could easily be that I'm just wrong here, so let's talk about this off-thread. I need to make sure this is supported one way or another if we're running it :).
,
Nov 17 2017
dpranke@, happy to discuss in detail offline. We have 4 folks funded by chrome, but they are doing something else (CFI and KASAN/syzkaller). We do support all the sanitizers as tools, but we mostly stopped working on chrome integration.
,
Nov 17 2017
> msan is computationally expensive, so I'm not sure how easy it'll be to add to the CQ, especially when we're already capacity constrained on linux.

Understood. I think a reasonable goal would be that there are optional msan try servers that match any msan coverage on the main waterfall, otherwise it's just too painful for developers to debug.

> We do need to figure out how to make the stack traces useful.
> I *think* that the GYP_DEFINES shouldn't be required at this point. I know I've fixed at least some of the places where this was needed, but I'm not sure if I've fixed all of them (i.e., we might still be missing things).

This is from the MSAN documentation - https://www.chromium.org/developers/testing/memorysanitizer - if that's out of date, great.

> I'd be surprised if you needed out/Release; when I was testing the failure, I didn't, but of course it was a pretty simple failure to test.

Also from the documentation, but I didn't get that far so I don't know. It says it's for test expectations so it probably didn't matter for this particular repro.

> i.e., dmazzoni, maybe you can run some of the errors you hit by me so I can double-check?
Here's the one I can't get past because I upgraded to Rodete:

FAILED: obj/third_party/instrumented_libraries/msan-chained-origins.txt
python ../../third_party/instrumented_libraries/scripts/unpack_binaries.py msan-chained-origins /ssd1tb/gitchrome3/src/third_party/instrumented_libraries/binaries /ssd1tb/gitchrome3/src/out/Release/instrumented_libraries_prebuilt obj/third_party/instrumented_libraries
Traceback (most recent call last):
  File "../../third_party/instrumented_libraries/scripts/unpack_binaries.py", line 44, in <module>
    sys.exit(main(*sys.argv[1:]))
  File "../../third_party/instrumented_libraries/scripts/unpack_binaries.py", line 29, in main
    os.path.join(archive_dir, get_archive_name(archive_prefix)),
  File "../../third_party/instrumented_libraries/scripts/unpack_binaries.py", line 18, in get_archive_name
    raise Exception("Supported Ubuntu versions: %s", str(supported_releases))
Exception: ('Supported Ubuntu versions: %s', "['trusty']")
,
Nov 17 2017
> Understood. I think a reasonable goal would be that there are optional
> msan try servers that match any msan coverage on the main waterfall,
> otherwise it's just too painful for developers to debug.

We have linux_chromium_msan_rel_ng and linux_chromium_chromeos_msan_rel_ng, which match the two configs on chromium.memory. We don't have a matching bot for "WebKit Linux Trusty MSAN", but that bot actually needs to go away and we should just move the webkit_layout_tests over to the main bot.

Generally speaking, every bot on a main waterfall is supposed to have at least an optional matching tryserver, but we're not at 100% yet.

> This is from the MSAN documentation - https://www.chromium.org/developers/testing/memorysanitizer
> - if that's out of date, great.

I'll see if we can update that. I don't think the test expectations are sensitive to the directory name, but will check.

I think the rodete failure is probably something we just haven't tested yet; because of the way we use the sysroots and instrumented libs, I *think* it'll just work, but I could easily be wrong.
,
Dec 27 2017
Does it mean that I can simply add the following two lines to unpack_binaries.py so I can debug MSAN build locally on my gLinux?
    if release == 'rodete':
        release = 'trusty'
It doesn't seem to work in my case.
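For reference, the workaround proposed in this comment could be sketched as below. This is a hypothetical illustration and not the actual contents of unpack_binaries.py: the names `SUPPORTED_RELEASES`, `RELEASE_ALIASES` and `resolve_release`, the extra `release` parameter, and the `<prefix>-<release>.tgz` archive naming are all assumptions. The sketch also formats the error message with `%`; the traceback earlier in the thread prints a tuple because the message and the list were passed to `Exception` as two separate arguments.

```python
# Hypothetical sketch; NOT the real unpack_binaries.py.
SUPPORTED_RELEASES = ['trusty']

# Assumed alias table: treat an unsupported distro as a supported one.
RELEASE_ALIASES = {'rodete': 'trusty'}


def resolve_release(release):
    """Map a distro codename to a release that has prebuilt archives."""
    release = RELEASE_ALIASES.get(release, release)
    if release not in SUPPORTED_RELEASES:
        # Interpolate with % so the list ends up inside the message string.
        raise Exception('Supported Ubuntu versions: %s' % SUPPORTED_RELEASES)
    return release


def get_archive_name(archive_prefix, release):
    """Build an archive name; the '<prefix>-<release>.tgz' pattern is assumed."""
    return '%s-%s.tgz' % (archive_prefix, resolve_release(release))
```

An alias like this only gets past the version check. The unpacked binaries were still built for Trusty, so they may depend on system libraries that a different distro does not ship, which could be one reason the change alone does not work.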
,
Dec 28 2017
Do you have any more details about the rodete failure? Is it a false error report from MSan, or a failure to download instrumented libraries, or something else? As far as I can see, GYP_DEFINES are no longer necessary. Instrumented libraries are downloaded automatically, unless you set checkout_configuration='small' in .gclient. Thank you, Dirk!
,
Jan 16 2018
,
Feb 14 2018
I tried the same as comment #9, but found that while I can build, I cannot execute browser_tests:

katydek@vega:~/chromium/src$ GYP_DEFINES='msan=1 use_prebuilt_instrumented_libraries=1' gclient runhooks
Running hooks: 100% (60/60), done.
katydek@vega:~/chromium/src$ ninja -C out/msan browser_tests -l 1000
ninja: Entering directory `out/msan'
[644/644] LINK ./browser_tests
katydek@vega:~/chromium/src$ out/msan/browser_tests --gtest_filter="*TrayAccessibilityTest.ShowTrayIcon*"
out/msan/browser_tests: error while loading shared libraries: libjpeg.so.8: cannot open shared object file: No such file or directory
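A loader error like the one above can be checked ahead of time by listing the shared libraries the binary cannot resolve. The following is a minimal sketch, assuming `ldd` is available on the machine; the helper names `missing_libraries` and `check_binary` are made up for illustration and are not part of any Chromium tooling.

```python
import subprocess


def missing_libraries(ldd_output):
    """Return the names of libraries that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if '=> not found' in line:
            # Lines look like "\tlibjpeg.so.8 => not found".
            missing.append(line.split('=>')[0].strip())
    return missing


def check_binary(path):
    """Run ldd on a binary and return its unresolved libraries."""
    result = subprocess.run(['ldd', path], capture_output=True,
                            text=True, check=False)
    return missing_libraries(result.stdout)
```

For example, `check_binary('out/msan/browser_tests')` would list every unresolved library at once, instead of only the first one the loader trips over.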
,
Mar 5 2018
Please try the instructions at: https://sites.google.com/a/chromium.org/dev/developers/testing/memorysanitizer?pli=1#TOC-Running-on-other-distros-using-Docker for reproducing MSan issues locally. Please let me know if you run into any issues in bug 751218. AFAIK GYP_DEFINES and out/Release are no longer necessary (the docs should reflect this). I'm probably not a good owner for the rest of the issues you've listed though (symbolization on bots and CQ). Would someone from infra be able to help with those?
,
May 21 2018
Looks like msan builds on the bot now have stacks: https://logs.chromium.org/v/?s=chromium%2Fbb%2Fchromium.clang%2FToTLinuxMSan%2F2543%2F%2B%2Frecipes%2Fsteps%2Fcomponents_browsertests%2F0%2Flogs%2FPdfAccessibilityTreeTest.TestEmptyPDFPage%2F0
,
May 21 2018
Setting GYP_DEFINES is also no longer needed, see the checkout_instrumented_libraries bit in src/DEPS.
,
May 21 2018
According to the docs, the "out/Release" bit is only needed for layout tests, so that's a blink thing, not an msan thing. I don't know if it's still needed. I removed the gyp parts from the msan docs.
,
Jan 11
Available, but no owner or component? Please find a component, as no one will ever find this without one.
Comment 1 by dmazz...@chromium.org, Nov 17 2017
Status: Assigned (was: Untriaged)