
Issue 772655

Starred by 4 users

Issue metadata

Status: Fixed
Owner:
Closed: Dec 2017
Cc:
EstimatedDays: ----
NextAction: ----
OS: Linux
Pri: 3
Type: Bug



Hotlists containing this issue:
Chromium-Packagers



Chromium with system_icu in C locale on Gentoo/ArchLinux, crash in url_formatter::IDNSpoofChecker::SimilarToTopDomains

Reported by brule.he...@gmail.com, Oct 7 2017

Issue description

Chrome Version       : 61.0.3163.100
OS Version: Linux Gentoo

What steps will reproduce the problem?
1. Just type C into the URL bar


What is the expected result?
Not crash

What happens instead of that?
Crash

Please provide any additional information below. Attach a screenshot if
possible.

UserAgentString: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36


Thread 1 "chrome" received signal SIGSEGV, Segmentation fault.
0x000055555838403f in url_formatter::IDNSpoofChecker::SimilarToTopDomains(base::BasicStringPiece<std::__cxx11::basic_string<unsigned short, base::string16_char_traits, std::allocator<unsigned short> > >) ()
(gdb) bt
#0  0x000055555838403f in url_formatter::IDNSpoofChecker::SimilarToTopDomains(base::BasicStringPiece<std::__cxx11::basic_string<unsigned short, base::string16_char_traits, std::allocator<unsigned short> > >) ()
#1  0x0000555558381ee2 in url_formatter::(anonymous namespace)::IDNToUnicodeWithAdjustments(base::BasicStringPiece<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::vector<base::OffsetAdjuster::Adjustment, std::allocator<base::OffsetAdjuster::Adjustment> >*) ()
#2  0x000055555838299e in url_formatter::(anonymous namespace)::HostComponentTransform::Execute(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<base::OffsetAdjuster::Adjustment, std::allocator<base::OffsetAdjuster::Adjustment> >*) const ()
#3  0x00005555583816de in url_formatter::(anonymous namespace)::AppendFormattedComponent(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, url::Component const&, url_formatter::(anonymous namespace)::AppendComponentTransform const&, std::__cxx11::basic_string<unsigned short, base::string16_char_traits, std::allocator<unsigned short> >*, url::Component*, std::vector<base::OffsetAdjuster::Adjustment, std::allocator<base::OffsetAdjuster::Adjustment> >*) ()
#4  0x0000555558382efd in url_formatter::FormatUrlWithAdjustments[abi:cxx11](GURL const&, unsigned int, unsigned int, url::Parsed*, unsigned long*, std::vector<base::OffsetAdjuster::Adjustment, std::allocator<base::OffsetAdjuster::Adjustment> >*) ()
#5  0x0000555558383d16 in url_formatter::FormatUrl[abi:cxx11](GURL const&, unsigned int, unsigned int, url::Parsed*, unsigned long*, unsigned long*) ()
#6  0x00005555595eef29 in HistoryURLProvider::SuggestExactInput(AutocompleteInput const&, GURL const&, bool) ()
#7  0x00005555595f5e74 in HistoryURLProvider::Start(AutocompleteInput const&, bool) ()
#8  0x000055555aea3264 in AutocompleteController::Start(AutocompleteInput const&) ()
#9  0x000055555aea0f3c in AutocompleteClassifier::Classify(std::__cxx11::basic_string<unsigned short, base::string16_char_traits, std::allocator<unsigned short> > const&, bool, bool, metrics::OmniboxEventProto_PageClassification, AutocompleteMatch*, GURL*) ()
#10 0x000055555aeb9b67 in SearchProvider::ScoreHistoryResultsHelper(std::vector<history::KeywordSearchTermVisit, std::allocator<history::KeywordSearchTermVisit> > const&, bool, bool, std::__cxx11::basic_string<unsigned short, base::string16_char_traits, std::allocator<unsigned short> > const&, bool) ()
#11 0x000055555aeb9ef4 in SearchProvider::ScoreHistoryResults(std::vector<history::KeywordSearchTermVisit, std::allocator<history::KeywordSearchTermVisit> > const&, bool, std::vector<SearchSuggestionParser::SuggestResult, std::allocator<SearchSuggestionParser::SuggestResult> >*) ()
#12 0x000055555aeba703 in SearchProvider::Start(AutocompleteInput const&, bool) ()
#13 0x000055555aea3264 in AutocompleteController::Start(AutocompleteInput const&) ()
#14 0x000055555aeac4bc in OmniboxEditModel::StartAutocomplete(bool, bool) ()
#15 0x000055555aeac8bc in OmniboxEditModel::OnAfterPossibleChange(OmniboxView::StateChanges const&, bool) ()
#16 0x000055555971e984 in OmniboxViewViews::OnAfterPossibleChange(bool) ()
#17 0x0000555558d89f03 in views::Textfield::DoInsertChar(unsigned short) ()
#18 0x0000555558d87607 in views::Textfield::InsertChar(ui::KeyEvent const&) ()
#19 0x000055555aa164fd in ui::InputMethodAuraLinux::ProcessKeyEventDone(ui::KeyEvent*, bool, bool) [clone .part.55] [clone .constprop.61] ()
#20 0x000055555aa16730 in ui::InputMethodAuraLinux::DispatchKeyEvent(ui::KeyEvent*) ()
#21 0x000055555899c197 in aura::WindowEventDispatcher::PreDispatchKeyEvent(ui::KeyEvent*) ()
#22 0x000055555899cdc3 in aura::WindowEventDispatcher::PreDispatchEvent(ui::EventTarget*, ui::Event*) ()
#23 0x0000555557f9f85b in ui::EventDispatcherDelegate::DispatchEvent(ui::EventTarget*, ui::Event*) ()
#24 0x000055555a820a7e in ui::EventProcessor::OnEventFromSource(ui::Event*) ()
#25 0x000055555a820ceb in ui::EventSource::DeliverEventToSink(ui::Event*) ()
#26 0x000055555a821256 in ui::EventSource::SendEventToSink(ui::Event*) ()
#27 0x0000555558dabc28 in views::DesktopWindowTreeHostX11::DispatchKeyEvent(ui::KeyEvent*) ()
#28 0x0000555558db0870 in views::DesktopWindowTreeHostX11::DispatchEvent(_XEvent* const&) ()
#29 0x0000555557f94b3c in ui::PlatformEventSource::DispatchEvent(_XEvent*) ()
#30 0x00005555582fcc81 in ui::X11EventSource::ExtractCookieDataDispatchEvent(_XEvent*) ()
#31 0x00005555582fcd6d in ui::X11EventSource::DispatchXEvents() ()
#32 0x000055555a83554c in ui::(anonymous namespace)::XSourceDispatch(_GSource*, int (*)(void*), void*) ()
#33 0x00007ffff6e393ba in g_main_context_dispatch () from /usr/lib64/libglib-2.0.so.0
#34 0x00007ffff6e39738 in g_main_context_iterate.isra () from /usr/lib64/libglib-2.0.so.0
#35 0x00007ffff6e397dc in g_main_context_iteration () from /usr/lib64/libglib-2.0.so.0
#36 0x0000555557a5ec4a in base::MessagePumpGlib::Run(base::MessagePump::Delegate*) ()
#37 0x0000555557a83c0a in base::RunLoop::Run() ()
#38 0x000055555791bae9 in ChromeBrowserMainParts::MainMessageLoopRun(int*) ()
#39 0x00005555566fb75a in content::BrowserMainLoop::RunMainMessageLoopParts() ()
#40 0x00005555566fdfad in content::BrowserMainRunnerImpl::Run() ()
#41 0x00005555566f6d11 in content::BrowserMain(content::MainFunctionParams const&) ()
#42 0x00005555576de9d8 in content::ContentMainRunnerImpl::Run() ()
#43 0x00005555576ebc96 in service_manager::Main(service_manager::MainParams const&) ()
#44 0x00005555576dd5c1 in content::ContentMain(content::ContentMainParams const&) ()
#45 0x00005555560f72c3 in ChromeMain ()
#46 0x00007fffef3ba610 in __libc_start_main () from /lib64/libc.so.6
#47 0x00005555560f7139 in _start ()
 
Attachment: crash.txt (6.0 KB)
Labels: Needs-Triage-M61
Cc: ranjitkan@chromium.org
Labels: Needs-Feedback
Unable to reproduce the issue on Ubuntu 14.04 using chrome version 61.0.3163.100. No crash was observed after typing c in the address bar of chrome.

@brule.herman: Request you to do a system restart, relaunch chrome and try again. Please update us with your observations. Also request you to please help us with a crash ID from chrome://crashes pages. This will help us to triage the issue better.

Thanks.!
It has to do with www-client/chromium built with system-icu under Gentoo, from Chromium 57 through 61. The result is the same after a system restart. chrome://crashes shows "Crash reporting is disabled". ICU used under Gentoo: dev-libs/icu-58.2-r1
Project Member

Comment 4 by sheriffbot@chromium.org, Oct 9 2017

Labels: -Needs-Feedback
Thank you for providing more feedback. Adding requester "ranjitkan@chromium.org" to the cc list and removing "Needs-Feedback" label.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Cc: pbomm...@chromium.org
Labels: TE-NeedsTriageFromMTV
Could someone from the MTV team take a look at this issue, as the mentioned hardware is not available to the India team?

Thanks.!
Cc: thomasanderson@chromium.org
Also reproducible on Arch Linux with Chromium 62.0.3202.89.

Steps to reproduce:

1) Build Chromium against system ICU 59.1
2) Run `LANG=C chromium` (regular locales like en_US.UTF-8 do NOT exhibit this issue)
3) Type "α" into the address bar

The crash occurs in components/url_formatter/idn_spoof_checker.cc. The call to icu::Transliterator::createFromRules() fails with U_RULE_MASK_ERROR which leaves transliterator_ set to NULL, causing the transliterator_.get()->transliterate() call to trigger a SIGSEGV later on.

The issue is NOT reproducible with Chrome or Chromium with bundled ICU. It only occurs when Chromium is using the system ICU (crash confirmed with version 59.1).

I'm not sure where to look next, perhaps at the patches that Chromium applies to its own ICU?
I figured out why the bundled ICU works fine; it's because it's built with U_CHARSET_IS_UTF8=1. If I build the system ICU with CPPFLAGS+=' -DU_CHARSET_IS_UTF8=1' then it works fine too.

Back to the regular system ICU though, the parse error shows that these rules mask each other:

  preContext = u"\xfffd\xfffd > l;\000\000\000\000\000\000\000\000"
  postContext = u"\xfffd\xfffd > o;\000\000\000\000\000\000\000\000"

These appear to originate from the "ł > l; ø > o;" rules, with each byte substituted by a 0xFFFD replacement character. (Other Unicode characters in the rule list are similarly changed to a bunch of 0xFFFD characters.)
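
To illustrate why the two rules end up masking each other, here is a minimal sketch. It assumes a converter that turns every non-ASCII byte into U+FFFD, which is what the parse error output suggests. `DecodeAsUsAscii` and `RuleSource` are hypothetical helpers written for this illustration, not ICU or Chromium functions. The rule text is passed as raw UTF-8 bytes ("ł" is C5 82, "ø" is C3 B8).

```cpp
#include <string>

// Hypothetical helper (not an ICU or Chromium function): decode a byte string
// the way a strict US-ASCII converter would. Every byte >= 0x80 has no meaning
// in that charset, so it becomes U+FFFD.
std::u16string DecodeAsUsAscii(const std::string& bytes) {
  std::u16string out;
  for (unsigned char b : bytes) {
    out.push_back(b < 0x80 ? static_cast<char16_t>(b) : u'\uFFFD');
  }
  return out;
}

// Hypothetical helper: return the text before " > ", i.e. the source pattern
// of a simple "source > target;" rule. If " > " is absent, the whole string
// is returned.
std::u16string RuleSource(const std::u16string& rule) {
  return rule.substr(0, rule.find(u" > "));
}
```

Under this assumption, `RuleSource(DecodeAsUsAscii("\xC5\x82 > l;"))` and `RuleSource(DecodeAsUsAscii("\xC3\xB8 > o;"))` are both two U+FFFD characters. The second rule then has the same source pattern as the first and can never match, which is consistent with the U_RULE_MASK_ERROR reported above.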

Not sure what needs to be fixed here but it seems to be a bug in Chromium (not playing nice with system ICU and non-UTF-8 locales).
Cc: js...@chromium.org
+jshin ptal at c#7 and c#8
Owner: js...@chromium.org
Status: Assigned (was: Unconfirmed)

Comment 11 by js...@chromium.org, Dec 15 2017

This happens because Chrome's copy of ICU is built with UTF-8 as the encoding of what's stored in |char []|, but apparently the ICU packages on Arch Linux and Gentoo are not.



Comment 12 by js...@chromium.org, Dec 15 2017

Oh. I missed comment 8. Yes, that's what's happening. 

I'd argue that Arch Linux and Gentoo should set U_CHARSET_IS_UTF8=1, too. There's no point in leaving it off on modern Linux, where all the locales are UTF-8.

I'll file a bug against ICU that at least on Linux, U_CHARSET_IS_UTF8 should be '1' by default. 



Comment 13 by js...@chromium.org, Dec 15 2017

Summary: Chromium with system_icu in C locale on Gentoo/ArchLinux, crash in url_formatter::IDNSpoofChecker::SimilarToTopDomains (was: Crash into url_formatter::IDNSpoofChecker::SimilarToTopDomains)
> Run `LANG=C chromium` (regular locales like en_US.UTF-8 do NOT exhibit this issue)

This bug is even less important because ordinary users wouldn't use C locale. 

I can fix it by explicitly specifying the charset, though, at the cost of a bit more clutter in the code.

Comment 14 by js...@chromium.org, Dec 16 2017

Status: Started (was: Assigned)
https://chromium-review.googlesource.com/c/chromium/src/+/831247 is a CL. 


Comment 15 by js...@chromium.org, Dec 16 2017

https://ssl.icu-project.org/trac/ticket/13519 : a bug filed to turn on U_CHARSET_IS_UTF8 by default on Linux. 

Comment 16 by js...@chromium.org, Dec 16 2017

Labels: -Needs-Triage-M61
Project Member

Comment 17 by bugdroid1@chromium.org, Dec 16 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/e58fa0ba66272c5f28828b15d06c7e42a9882b3b

commit e58fa0ba66272c5f28828b15d06c7e42a9882b3b
Author: Jungshik Shin <jshin@chromium.org>
Date: Sat Dec 16 04:19:27 2017

Use fromUTF8() for UnicodeString construction from UTF-8

Chrome's copy of ICU is built with U_CHARSET_IS_UTF8=1 so that |char *|
buffer is treated as UTF-8 when constructing UnicodeString() regardless
of the default encoding of the current locale on Linux or non-Unicode code
page on Windows.

However, some Linux distros do not set U_CHARSET_IS_UTF8=1 when building
ICU and Chromium build with system_icu crashes when Chromium is run in
non-UTF-8 locale (e.g. 'C').

To make Chromium work in a non-UTF-8 locale (which is pretty rare these
days), use 'icu::UnicodeString::fromUTF8(StringPiece)' instead of
'icu::UnicodeString(const char*)'.

Bug:  772655 
Test: components_unittests --gtest_filter=*IDN*
Test: Chromium built with system_icu does not crash in C locale.
Change-Id: I0daa284ec06b8e83814fc70eb8e9e5c96444ebfa
Reviewed-on: https://chromium-review.googlesource.com/831247
Reviewed-by: Peter Kasting <pkasting@chromium.org>
Commit-Queue: Jungshik Shin <jshin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#524586}
[modify] https://crrev.com/e58fa0ba66272c5f28828b15d06c7e42a9882b3b/components/url_formatter/idn_spoof_checker.cc
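
The idea behind this change can be sketched without ICU: decode the bytes as UTF-8 explicitly instead of letting a default converter choose the charset. `DecodeUtf8` below is a hypothetical, simplified stand-in written for illustration. It is not ICU's implementation of `fromUTF8()` and has not been checked against it for malformed-input handling.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for an explicit UTF-8 decode (not ICU code).
// Converts UTF-8 bytes to UTF-16 without consulting the process locale.
// Each byte that cannot start a well-formed sequence becomes U+FFFD.
std::u16string DecodeUtf8(const std::string& bytes) {
  // Smallest code point allowed for each sequence length (rejects overlongs).
  static const std::uint32_t kMinForLength[] = {0, 0, 0x80, 0x800, 0x10000};
  std::u16string out;
  std::size_t i = 0;
  while (i < bytes.size()) {
    const unsigned char lead = static_cast<unsigned char>(bytes[i]);
    std::uint32_t cp = 0;
    std::size_t len = 0;
    if (lead < 0x80)                { cp = lead;        len = 1; }
    else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; len = 2; }
    else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; len = 3; }
    else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; len = 4; }
    bool ok = len != 0 && i + len <= bytes.size();
    for (std::size_t k = 1; ok && k < len; ++k) {
      const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
      if ((c & 0xC0) != 0x80) ok = false;     // not a continuation byte
      else cp = (cp << 6) | (c & 0x3F);
    }
    if (ok && (cp < kMinForLength[len] || cp > 0x10FFFF ||
               (cp >= 0xD800 && cp <= 0xDFFF))) {
      ok = false;                             // overlong, too large, surrogate
    }
    if (!ok) {
      out.push_back(static_cast<char16_t>(0xFFFD));
      ++i;                                    // resynchronise on the next byte
      continue;
    }
    if (cp >= 0x10000) {                      // encode as a surrogate pair
      cp -= 0x10000;
      out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
      out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
    } else {
      out.push_back(static_cast<char16_t>(cp));
    }
    i += len;
  }
  return out;
}
```

With an explicit decode, the rule bytes "\xC5\x82 > l;" yield U+0142 followed by " > l;" whatever `LANG` is set to, so the two rules keep distinct source patterns.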

Comment 18 by js...@chromium.org, Dec 16 2017

Labels: -TE-NeedsTriageFromMTV
Status: Fixed (was: Started)
Fixed in trunk.  I don't think it's worth merging to branches because there's little reason for end-users to run Chromium in a non-UTF-8 locale.

Linux distros building Chromium with system_icu can cherry-pick the CL recorded in the previous comment or build ICU with U_CHARSET_IS_UTF8=1.
My only concern would be other icu::UnicodeString() calls which might be affected by this. A quick grep doesn't show anything obvious (to me) though.

I agree about not merging to branches; Chrome itself isn't affected, only some downstream Chromium builds. However, I'm not sure if we can guarantee a UTF-8 locale on Linux, mainly due to possible misconfiguration on the user's part. If C.UTF-8 makes it to glibc upstream, and there is some way to always enforce a UTF-8 locale, UTF-8 would indeed be guaranteed.

I could also be misunderstanding how U_CHARSET_IS_UTF8=1 works and perhaps it makes a good choice and works fine even with the C locale. Or would stuff like "icu::UnicodeString(argv[1])" be broken when argv[1] contains non-UTF-8 encoded non-ASCII characters?

Comment 20 by js...@chromium.org, Dec 21 2017

> My only concern would be other icu::UnicodeString() calls which might be affected by this. A quick grep doesn't show anything obvious (to me) though.

I did that quick grep, too. :-) I didn't find any. If I had found any, I'd have included them in the CL.

>  would stuff like "icu::UnicodeString(argv[1])" be broken when argv[1] contains non-UTF-8 encoded non-ASCII characters?

Yeah... It'd break.  I used that construct because I know I turned on U_CHARSET_IS_UTF8=1 and Chrome does not pass a command line flag/filename to UnicodeString(). 
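
The argv case can be made concrete with a small structural check. This is a sketch under stated assumptions: `LooksLikeUtf8` is a hypothetical helper, not part of ICU or Chromium, and it only verifies that each lead byte is followed by the right number of continuation bytes. It does not reject overlong forms or surrogates.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not ICU or Chromium code): returns true when the bytes
// are structurally valid UTF-8, i.e. every lead byte is followed by the
// expected number of 10xxxxxx continuation bytes.
bool LooksLikeUtf8(const std::string& bytes) {
  std::size_t i = 0;
  while (i < bytes.size()) {
    const unsigned char lead = static_cast<unsigned char>(bytes[i]);
    std::size_t len;
    if (lead < 0x80)                len = 1;
    else if ((lead & 0xE0) == 0xC0) len = 2;
    else if ((lead & 0xF0) == 0xE0) len = 3;
    else if ((lead & 0xF8) == 0xF0) len = 4;
    else return false;                        // stray continuation or bad lead
    if (i + len > bytes.size()) return false; // sequence is truncated
    for (std::size_t k = 1; k < len; ++k) {
      const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
      if ((c & 0xC0) != 0x80) return false;
    }
    i += len;
  }
  return true;
}
```

For example, "café" encoded as ISO-8859-1 is the bytes `caf\xE9`. The byte 0xE9 looks like the start of a three-byte UTF-8 sequence with nothing after it, so the check fails, and a decoder that assumes UTF-8 would not recover the "é".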
