Chromium with system_icu in C locale on Gentoo/Arch Linux, crash in url_formatter::IDNSpoofChecker::SimilarToTopDomains
Reported by
brule.he...@gmail.com,
Oct 7 2017
Issue description

Chrome Version: 61.0.3163.100
OS Version: Linux Gentoo

What steps will reproduce the problem?
1. Just type C into the URL bar

What is the expected result?
No crash

What happens instead of that?
Crash

Please provide any additional information below. Attach a screenshot if possible.

UserAgentString: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36

Thread 1 "chrome" received signal SIGSEGV, Segmentation fault.
0x000055555838403f in url_formatter::IDNSpoofChecker::SimilarToTopDomains(base::BasicStringPiece<std::__cxx11::basic_string<unsigned short, base::string16_char_traits, std::allocator<unsigned short> > >) ()
(gdb) bt
#0 0x000055555838403f in url_formatter::IDNSpoofChecker::SimilarToTopDomains(base::BasicStringPiece<std::__cxx11::basic_string<unsigned short, base::string16_char_traits, std::allocator<unsigned short> > >) ()
#1 0x0000555558381ee2 in url_formatter::(anonymous namespace)::IDNToUnicodeWithAdjustments(base::BasicStringPiece<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::vector<base::OffsetAdjuster::Adjustment, std::allocator<base::OffsetAdjuster::Adjustment> >*) ()
#2 0x000055555838299e in url_formatter::(anonymous namespace)::HostComponentTransform::Execute(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<base::OffsetAdjuster::Adjustment, std::allocator<base::OffsetAdjuster::Adjustment> >*) const ()
#3 0x00005555583816de in url_formatter::(anonymous namespace)::AppendFormattedComponent(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, url::Component const&, url_formatter::(anonymous namespace)::AppendComponentTransform const&, std::__cxx11::basic_string<unsigned short, base::string16_char_traits, std::allocator<unsigned short> >*, url::Component*, std::vector<base::OffsetAdjuster::Adjustment, std::allocator<base::OffsetAdjuster::Adjustment> >*) ()
#4 0x0000555558382efd in url_formatter::FormatUrlWithAdjustments[abi:cxx11](GURL const&, unsigned int, unsigned int, url::Parsed*, unsigned long*, std::vector<base::OffsetAdjuster::Adjustment, std::allocator<base::OffsetAdjuster::Adjustment> >*) ()
#5 0x0000555558383d16 in url_formatter::FormatUrl[abi:cxx11](GURL const&, unsigned int, unsigned int, url::Parsed*, unsigned long*, unsigned long*) ()
#6 0x00005555595eef29 in HistoryURLProvider::SuggestExactInput(AutocompleteInput const&, GURL const&, bool) ()
#7 0x00005555595f5e74 in HistoryURLProvider::Start(AutocompleteInput const&, bool) ()
#8 0x000055555aea3264 in AutocompleteController::Start(AutocompleteInput const&) ()
#9 0x000055555aea0f3c in AutocompleteClassifier::Classify(std::__cxx11::basic_string<unsigned short, base::string16_char_traits, std::allocator<unsigned short> > const&, bool, bool, metrics::OmniboxEventProto_PageClassification, AutocompleteMatch*, GURL*) ()
#10 0x000055555aeb9b67 in SearchProvider::ScoreHistoryResultsHelper(std::vector<history::KeywordSearchTermVisit, std::allocator<history::KeywordSearchTermVisit> > const&, bool, bool, std::__cxx11::basic_string<unsigned short, base::string16_char_traits, std::allocator<unsigned short> > const&, bool) ()
#11 0x000055555aeb9ef4 in SearchProvider::ScoreHistoryResults(std::vector<history::KeywordSearchTermVisit, std::allocator<history::KeywordSearchTermVisit> > const&, bool, std::vector<SearchSuggestionParser::SuggestResult, std::allocator<SearchSuggestionParser::SuggestResult> >*) ()
#12 0x000055555aeba703 in SearchProvider::Start(AutocompleteInput const&, bool) ()
#13 0x000055555aea3264 in AutocompleteController::Start(AutocompleteInput const&) ()
#14 0x000055555aeac4bc in OmniboxEditModel::StartAutocomplete(bool, bool) ()
#15 0x000055555aeac8bc in OmniboxEditModel::OnAfterPossibleChange(OmniboxView::StateChanges const&, bool) ()
#16 0x000055555971e984 in OmniboxViewViews::OnAfterPossibleChange(bool) ()
#17 0x0000555558d89f03 in views::Textfield::DoInsertChar(unsigned short) ()
#18 0x0000555558d87607 in views::Textfield::InsertChar(ui::KeyEvent const&) ()
#19 0x000055555aa164fd in ui::InputMethodAuraLinux::ProcessKeyEventDone(ui::KeyEvent*, bool, bool) [clone .part.55] [clone .constprop.61] ()
#20 0x000055555aa16730 in ui::InputMethodAuraLinux::DispatchKeyEvent(ui::KeyEvent*) ()
#21 0x000055555899c197 in aura::WindowEventDispatcher::PreDispatchKeyEvent(ui::KeyEvent*) ()
#22 0x000055555899cdc3 in aura::WindowEventDispatcher::PreDispatchEvent(ui::EventTarget*, ui::Event*) ()
#23 0x0000555557f9f85b in ui::EventDispatcherDelegate::DispatchEvent(ui::EventTarget*, ui::Event*) ()
#24 0x000055555a820a7e in ui::EventProcessor::OnEventFromSource(ui::Event*) ()
#25 0x000055555a820ceb in ui::EventSource::DeliverEventToSink(ui::Event*) ()
#26 0x000055555a821256 in ui::EventSource::SendEventToSink(ui::Event*) ()
#27 0x0000555558dabc28 in views::DesktopWindowTreeHostX11::DispatchKeyEvent(ui::KeyEvent*) ()
#28 0x0000555558db0870 in views::DesktopWindowTreeHostX11::DispatchEvent(_XEvent* const&) ()
#29 0x0000555557f94b3c in ui::PlatformEventSource::DispatchEvent(_XEvent*) ()
#30 0x00005555582fcc81 in ui::X11EventSource::ExtractCookieDataDispatchEvent(_XEvent*) ()
#31 0x00005555582fcd6d in ui::X11EventSource::DispatchXEvents() ()
#32 0x000055555a83554c in ui::(anonymous namespace)::XSourceDispatch(_GSource*, int (*)(void*), void*) ()
#33 0x00007ffff6e393ba in g_main_context_dispatch () from /usr/lib64/libglib-2.0.so.0
#34 0x00007ffff6e39738 in g_main_context_iterate.isra () from /usr/lib64/libglib-2.0.so.0
#35 0x00007ffff6e397dc in g_main_context_iteration () from /usr/lib64/libglib-2.0.so.0
#36 0x0000555557a5ec4a in base::MessagePumpGlib::Run(base::MessagePump::Delegate*) ()
#37 0x0000555557a83c0a in base::RunLoop::Run() ()
#38 0x000055555791bae9 in ChromeBrowserMainParts::MainMessageLoopRun(int*) ()
#39 0x00005555566fb75a in content::BrowserMainLoop::RunMainMessageLoopParts() ()
#40 0x00005555566fdfad in content::BrowserMainRunnerImpl::Run() ()
#41 0x00005555566f6d11 in content::BrowserMain(content::MainFunctionParams const&) ()
#42 0x00005555576de9d8 in content::ContentMainRunnerImpl::Run() ()
#43 0x00005555576ebc96 in service_manager::Main(service_manager::MainParams const&) ()
#44 0x00005555576dd5c1 in content::ContentMain(content::ContentMainParams const&) ()
#45 0x00005555560f72c3 in ChromeMain ()
#46 0x00007fffef3ba610 in __libc_start_main () from /lib64/libc.so.6
#47 0x00005555560f7139 in _start ()
,
Oct 9 2017
Unable to reproduce the issue on Ubuntu 14.04 using Chrome version 61.0.3163.100. No crash was observed after typing c in the address bar. @brule.herman: Please restart your system, relaunch Chrome, and try again, then update us with your observations. Please also provide a crash ID from the chrome://crashes page; this will help us triage the issue better. Thanks!
,
Oct 9 2017
It has to do with www-client/chromium built with system-icu under Gentoo, from Chromium 57 through 61. Same result after a system restart. chrome://crashes -> Crash reporting is disabled. ICU used under Gentoo: dev-libs/icu-58.2-r1
,
Oct 9 2017
Thank you for providing more feedback. Adding requester "ranjitkan@chromium.org" to the cc list and removing "Needs-Feedback" label. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Oct 10 2017
Could someone from the MTV team take a look at this issue? The mentioned hardware is not available to the India team. Thanks!
,
Oct 10 2017
,
Nov 7 2017
Also reproducible on Arch Linux with Chromium 62.0.3202.89.

Steps to reproduce:
1) Build Chromium against system ICU 59.1
2) Run `LANG=C chromium` (regular locales like en_US.UTF-8 do NOT exhibit this issue)
3) Type "α" into the address bar

The crash occurs in components/url_formatter/idn_spoof_checker.cc. The call to icu::Transliterator::createFromRules() fails with U_RULE_MASK_ERROR which leaves transliterator_ set to NULL, causing the transliterator_.get()->transliterate() call to trigger a SIGSEGV later on.

The issue is NOT reproducible with Chrome or Chromium with bundled ICU. It only occurs when Chromium is using the system ICU (crash confirmed with version 59.1). I'm not sure where to look next, perhaps at the patches that Chromium applies to its own ICU?
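To illustrate the failure mode described in this comment (a factory that reports an error and hands back a null pointer, followed by an unchecked dereference), here is a minimal sketch. It does not use ICU: FakeTransliterator, Status, CreateFromRules and SafeTransliterate are hypothetical stand-ins invented for this example, and the null check shown is only one possible way to avoid the SIGSEGV, not the change that was made in Chromium.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for a transliterator; this is NOT the ICU class.
struct FakeTransliterator {
  std::string Transliterate(const std::string& in) const { return in; }
};

// Hypothetical status codes, loosely modelled on "success" vs. a rule error.
enum class Status { kOk, kRuleMaskError };

// Hypothetical factory that follows the "null on failure plus status code"
// pattern described in the comment above.
std::unique_ptr<FakeTransliterator> CreateFromRules(bool rules_parse_ok,
                                                    Status* status) {
  assert(status != nullptr);
  if (!rules_parse_ok) {
    *status = Status::kRuleMaskError;
    return nullptr;
  }
  *status = Status::kOk;
  return std::make_unique<FakeTransliterator>();
}

// Refuses to dereference a null transliterator. Returns false and leaves
// *out untouched when no transliterator is available.
bool SafeTransliterate(const FakeTransliterator* t, const std::string& in,
                       std::string* out) {
  assert(out != nullptr);
  if (t == nullptr) return false;
  *out = t->Transliterate(in);
  return true;
}
```

Calling t->Transliterate(...) directly on the pointer returned by CreateFromRules(false, &status) would dereference null, which is the same shape as the crash in the backtrace.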
,
Nov 8 2017
I figured out why the bundled ICU works fine; it's because it's built with U_CHARSET_IS_UTF8=1. If I build the system ICU with CPPFLAGS+=' -DU_CHARSET_IS_UTF8=1' then it works fine too. Back to the regular system ICU though, the parse error shows that these rules mask each other: preContext = u"\xfffd\xfffd > l;\000\000\000\000\000\000\000\000" postContext = u"\xfffd\xfffd > o;\000\000\000\000\000\000\000\000" These appear to originate from the "ł > l; ø > o;" rules, with each byte substituted by a 0xFFFD replacement character. (Other Unicode characters in the rule list are similarly changed to a bunch of 0xFFFD characters.) Not sure what needs to be fixed here but it seems to be a bug in Chromium (not playing nice with system ICU and non-UTF-8 locales).
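The byte-by-byte substitution can be modelled without ICU. The sketch below assumes a simplified "ASCII-only" decoder in which every byte >= 0x80 becomes U+FFFD; this is an approximation of what a C-locale conversion appears to do here, not ICU's actual converter, and DecodeAsAsciiOnly and RuleSource are names made up for the example. It shows why "ł > l;" and "ø > o;" end up with identical left-hand sides, so one rule masks the other.

```cpp
#include <cassert>
#include <string>

// Simplified model (an assumption, not ICU's converter): treat the input as
// US-ASCII and map every byte >= 0x80 to U+FFFD REPLACEMENT CHARACTER.
std::u16string DecodeAsAsciiOnly(const std::string& bytes) {
  std::u16string out;
  out.reserve(bytes.size());
  for (unsigned char b : bytes) {
    out.push_back(b < 0x80 ? static_cast<char16_t>(b) : u'\uFFFD');
  }
  return out;
}

// Returns the left-hand side of a single "source > target;" rule.
std::u16string RuleSource(const std::u16string& rule) {
  const std::u16string arrow = u" > ";
  const std::u16string::size_type pos = rule.find(arrow);
  assert(pos != std::u16string::npos && "rule must contain ' > '");
  return rule.substr(0, pos);
}
```

With the UTF-8 bytes of "ł > l;" ("\xC5\x82 > l;") and "ø > o;" ("\xC3\xB8 > o;") as input, both decoded rules start with two U+FFFD code units, matching the preContext and postContext strings quoted above.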
,
Nov 9 2017
+jshin ptal at c#7 and c#8
,
Dec 13 2017
,
Dec 15 2017
This happens because Chrome's copy of ICU is built with UTF-8 as the encoding of what's stored in |char []|, but apparently the ICU builds on Arch Linux and Gentoo are not.
,
Dec 15 2017
Oh. I missed comment 8. Yes, that's what's happening. I'd argue that Arch Linux and Gentoo should set U_CHARSET_IS_UTF8=1, too. There's no point in leaving it off on modern Linux, where all the locales are UTF-8. I'll file a bug against ICU asking that, at least on Linux, U_CHARSET_IS_UTF8 be '1' by default.
,
Dec 15 2017
> Run `LANG=C chromium` (regular locales like en_US.UTF-8 do NOT exhibit this issue)

This bug is even less important because ordinary users wouldn't use the C locale. I can fix it by explicitly specifying the charset, though, at the cost of a bit more clutter in the code.
,
Dec 16 2017
https://chromium-review.googlesource.com/c/chromium/src/+/831247 is a CL.
,
Dec 16 2017
https://ssl.icu-project.org/trac/ticket/13519 : a bug filed to turn on U_CHARSET_IS_UTF8 by default on Linux.
,
Dec 16 2017
,
Dec 16 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/e58fa0ba66272c5f28828b15d06c7e42a9882b3b

commit e58fa0ba66272c5f28828b15d06c7e42a9882b3b
Author: Jungshik Shin <jshin@chromium.org>
Date: Sat Dec 16 04:19:27 2017

Use fromUTF8() for UnicodeString construction from UTF-8

Chrome's copy of ICU is built with U_CHARSET_IS_UTF8=1 so that |char *| buffer is treated as UTF-8 when constructing UnicodeString() regardless of the default encoding of the current locale on Linux or non-Unicode code page on Windows.

However, some Linux distros do not set U_CHARSET_IS_UTF8=1 when building ICU and Chromium build with system_icu crashes when Chromium is run in non-UTF-8 locale (e.g. 'C').

To make Chromium work in a non-UTF-8 locale (which is pretty rare these days), use 'icu::UnicodeString::fromUTF8(StringPiece)' instead of 'icu::UnicodeString(const char*)'.

Bug: 772655
Test: components_unittests --gtest_filter=*IDN*
Test: Chromium built with system_icu does not crash in C locale.
Change-Id: I0daa284ec06b8e83814fc70eb8e9e5c96444ebfa
Reviewed-on: https://chromium-review.googlesource.com/831247
Reviewed-by: Peter Kasting <pkasting@chromium.org>
Commit-Queue: Jungshik Shin <jshin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#524586}

[modify] https://crrev.com/e58fa0ba66272c5f28828b15d06c7e42a9882b3b/components/url_formatter/idn_spoof_checker.cc
,
Dec 16 2017
Fixed in trunk. I don't think it's worth merging to branches because there's little reason for end-users to run Chromium in a non-UTF-8 locale. Linux distros building Chromium with system_icu can cherry-pick the CL recorded in the previous comment or build ICU with U_CHARSET_IS_UTF8=1.
,
Dec 16 2017
My only concern would be other icu::UnicodeString() calls which might be affected by this. A quick grep doesn't show anything obvious (to me) though. I agree about not merging to branches; Chrome itself isn't affected, only some downstream Chromium builds. However, I'm not sure if we can guarantee a UTF-8 locale on Linux, mainly due to possible misconfiguration on the user's part. If C.UTF-8 makes it to glibc upstream, and there is some way to always enforce a UTF-8 locale, UTF-8 would indeed be guaranteed. I could also be misunderstanding how U_CHARSET_IS_UTF8=1 works and perhaps it makes a good choice and works fine even with the C locale. Or would stuff like "icu::UnicodeString(argv[1])" be broken when argv[1] contains non-UTF-8 encoded non-ASCII characters?
,
Dec 21 2017
> My only concern would be other icu::UnicodeString() calls which might be affected by this. A quick grep doesn't show anything obvious (to me) though.

I did that quick grep, too. :-) I didn't find any. If I had found any, I'd have included them in the CL.

> would stuff like "icu::UnicodeString(argv[1])" be broken when argv[1] contains non-UTF-8 encoded non-ASCII characters?

Yeah... It'd break. I used that construct because I know I turned on U_CHARSET_IS_UTF8=1 and Chrome does not pass a command line flag/filename to UnicodeString().
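As a rough illustration of the argv[1] concern: when a char buffer is assumed to be UTF-8, bytes in a legacy encoding such as Latin-1 are simply not well-formed input. The sketch below is a standalone well-formedness check written for this example (IsValidUtf8 is a hypothetical helper, not an ICU or Chromium function); it says nothing about what ICU does with such input beyond the "it'd break" above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Returns true if `s` is well-formed UTF-8: correct continuation bytes, no
// overlong encodings, no surrogates, nothing above U+10FFFF.
bool IsValidUtf8(const std::string& s) {
  const std::size_t n = s.size();
  std::size_t i = 0;
  while (i < n) {
    const unsigned char lead = static_cast<unsigned char>(s[i]);
    if (lead < 0x80) {  // Plain ASCII byte.
      ++i;
      continue;
    }
    std::size_t len = 0;
    std::uint32_t cp = 0;
    std::uint32_t min_cp = 0;  // Smallest code point allowed for this length.
    if ((lead & 0xE0) == 0xC0) {
      len = 2; cp = lead & 0x1F; min_cp = 0x80;
    } else if ((lead & 0xF0) == 0xE0) {
      len = 3; cp = lead & 0x0F; min_cp = 0x800;
    } else if ((lead & 0xF8) == 0xF0) {
      len = 4; cp = lead & 0x07; min_cp = 0x10000;
    } else {
      return false;  // Stray continuation byte or invalid lead byte.
    }
    assert(len >= 2 && len <= 4);
    if (n - i < len) return false;  // Truncated sequence.
    for (std::size_t k = 1; k < len; ++k) {
      const unsigned char c = static_cast<unsigned char>(s[i + k]);
      if ((c & 0xC0) != 0x80) return false;
      cp = (cp << 6) | (c & 0x3F);
    }
    if (cp < min_cp || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
      return false;
    }
    i += len;
  }
  return true;
}
```

For example, "caf\xC3\xA9" (UTF-8 "café") passes, while "caf\xE9" (the same word in Latin-1) fails because 0xE9 looks like the start of a three-byte sequence that never arrives.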
Comment 1 by nyerramilli@chromium.org
, Oct 9 2017