New issue
Advanced search Search tips

Issue 619421 link

Starred by 4 users

Issue metadata

Status: Duplicate
Merged: issue v8:287
Owner:
Closed: Nov 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: All
Pri: 2
Type: Bug

Blocking:
issue v8:6735


Show other hotlists

Hotlists containing this issue:
Hotlist-1


Sign in to add a comment

[regexp] RegExp with exponential runtime on blog.chromium.org

Reported by jleedev@gmail.com, Jun 12 2016

Issue description

Chrome Version       : 53.0.2750.0 (Developer Build) (64-bit)
Operating System and Version: OS X, Chrome OS, Node.
URLs (if applicable) : http://blog.chromium.org/2016_06_01_archive.html

Description of performance problem:

1. Load http://blog.chromium.org/2016_06_01_archive.html
2. Observe 100% CPU usage in BreakpointHandler.finalizeSummary

I can reproduce this in Firefox, but I cannot reproduce this in Safari.


Remember to attach your trace file to this bug!

Attached. Also attached is a stripped down copy of blog.chromium.org that still exhibits the issue, as well as a single regexp call; pasted below but also attached since it has significant whitespace.


s="\n\n                     \n                     <ul>\n                      <li>\n                        <div>\n                          <span>Sites</span>\n                          <a><span>can now experiment</span></a>\n                          <span>with</span>\n                          <a><span>persistent storage</span></a>\n                          <span>as an</span>\n                          <span>, allowing a site to disable automatic storage clearing when bookmarked.</span>\n                        </div>\n                      </li><li>\n                        <div>\n                          <span>Multiple WebVTT tracks will now be presented as</span>\n                          <a><span>user options</span></a> \n                        </div>\n                    </li></ul>";
t=s.replace(/(<br>|\s+)*$/,'');

 
trace_Sun_Jun_12_2016_4.07.38_PM.json.gz
6.1 MB Download
index.html
4.0 KB View Download
s.js
840 bytes View Download

Comment 1 by jleedev@gmail.com, Jun 16 2016

It appears the regexp on the blog has been fixed:

        summaryHtml = $.trim(summaryHtml).replace(/(<br>|\s)+$/,'');

Nonetheless, Safari appears to optimize the regexp in this example in a way which Chrome does not, which may still be interesting.
Components: Blink>JavaScript
Setting component Blink>Javascript for V8 team to triage.
Cc: yangguo@chromium.org
Status: Available (was: Unconfirmed)
Feel free to close it if this is not something actionable.

Comment 4 by jleedev@gmail.com, Jul 20 2016

This is back, on googlechromereleases now.
Chrome Releases.html
245 KB View Download

Comment 5 by pdk...@gmail.com, Jul 20 2016

Confirmed. The page takes 100% CPU and doesn't render.

http://googlechromereleases.blogspot.com/
Here are some more data-points:

https://jsperf.com/pathological-regexp - Safari is 300million times faster than Chrome recognizing a tiny, 30character string.

Some googling suggests this may be the problem:
https://swtch.com/~rsc/regexp/regexp1.html
In case jsperf is down - which it is sometimes these days, here is the repro javascript:

var 
  regexp10 = /a?a?a?a?a?a?a?a?a?a?aaaaaaaaaa/,
  string10 = "aaaaaaaaaa",
  regexp20 = /a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?aaaaaaaaaaaaaaaaaaaa/,
  string20 = "aaaaaaaaaaaaaaaaaaaa",
  regexp30 = /a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/,
  string30 = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
;

# Tests:
regexp10.test(string10);
regexp20.test(string20);  
regexp30.test(string30);

# results:
# Chrome 57.0.2987.133 (64-bit):
# regexp10: 35,8914 / second
# regexp20: 110 / second
# regexp30: 0.11 / second
#
# Safari 10.1 (12603.1.30.0.34):
# regexp10: 50,098,582 / second
# regexp20: 40,519,142 / second
# regexp30: 32,515,791 / second

Blocking: v8:6735
Labels: OS-All
Owner: jgruber@chromium.org
Status: Assigned (was: Available)
Summary: [regexp] RegExp with exponential runtime on blog.chromium.org (was: Performance issue: Very slow RegExp on blog.chromium.org)
Labels: -Performance Performance-Power
Mergedinto: 287
Status: Duplicate (was: Assigned)
Exponential backtracking again. A sequence of n 'a?' in the given patterns in #7 will result in 2^n options that the regexp engine needs to try. A better solution would be /a{0,30}aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/.

The pattern from #4 and #0 is a bit different but also results in exponential behavior. A brief repro there is:

/(<br>|\s+)*$/.exec("                                a")

Here, I'd suggest slightly changing the pattern to 

/(<br>|\s)*$/

to remove the nested repetition.
To be clear, the "aaa" example was an example. Here is what I ran into in the real world:

/^([^,:]|\(.*\)|\{.*\}|\".*\"|\'.*\'|\[.*\])*$/.test('{"mappings":{"post_search":{"_parent":{"type":"topic"},"properties":{"topicId":{"type":"keyword"},"userId":{"type":"keyword"},"postOrder":{"type":"long"},"createdAt":{"type":"long"},"updatedAt":{"type":"long"},"isChapterStart":{"type":"boolean"},"caption":{"type":"text","analyzer":"english"},"tags":{"type":"text"},"mentions":{"type":"text"},"mediaMimeType":{"type":"keyword"},"mediaAspectRatio":{"type":"short"},"mediaUrl":{"type":"keyword","index":false},"mediaFocus":{"type":"object","index":false},"mediaDimensions":{"type":"object","index":false},"mediaColorInfo":{"type":"object","index":false},"templateUrl":{"type":"keyword","index":false},"templateType":{"type":"keyword"},"templateDropInCount":{"type":"byte"},"templateText":{"type":"text","analyzer":"english"},"templateDropInLocations":{"type":"nested","index":false},"templateUses":{"type":"integer"},"activityCount":{"type":"integer"},"lastActivityAt":{"type":"long"},"messageCount":{"type":"integer","index":false},"participantCount":{"type":"integer","index":false},"lastActiveUserId":{"type":"keyword","index":false},"lastMessageId":{"type":"keyword","index":false}}},"topic_search":{"_parent":{"type":"user"},"properties":{"title":{"type":"text","analyzer":"english"},"createdAt":{"type":"long"},"updatedAt":{"type":"long"},"lastPostCreatedAt":{"type":"long"},"lastPostId":{"type":"keyword"},"lastChapterPostId":{"type":"keyword"},"postCount":{"type":"integer"},"followerCount":{"type":"integer"},"activityCount":{"type":"long"},"messageCount":{"type":"long"},"isProfileTopic":{"type":"boolean"}}},"user_search":{"properties":{"displayName":{"type":"text","analyzer":"english"},"postCount":{"type":"integer"},"topicCount":{"type":"short"},"followerCount":{"type":"integer"},"messageCount":{"type":"integer"},"lastTopicCreatedAt":{"type":"long"},"lastPostCreatedAt":{"type":"long"},"profileTopicId":{"type":"keyword","index":false}}}},"settings":{}}');
#11 seems similar to /(<br>|\s+)*$/ as described in #10, A repeated alteration with nested repetitions.
Evidently there is a better way to implement RegExp which avoids these problems. There is a well-written article on this here: https://swtch.com/~rsc/regexp/regexp1.html

Did I mention Safari is 300 MILLION times faster than Chrome on this? 300 million?

Granted this doesn't effect most RegExp, BUT, it does effect real-world examples. I was shocked when my above example brought chrome to its knees where Safari breezed through it without a hiccup.



Last, this is not a duplicated of 287: input box rendering error : Facebook. 

https://bugs.chromium.org/p/chromium/issues/detail?id=287 has nothing to do with RegExp.

Can we unmerge this?
Mergedinto: -287 v8:287
Thanks, that should've been v8:287. See merged bugs there for rationale behind marking exponential backtracking bugs as dupes for now.

Sign in to add a comment