RegExp: /[\c%]/ is not yet handled by the spec
Issue description
/[\c%]/ doesn't match the RegExp grammar in Annex B and should throw a SyntaxError. The same holds for any '\cX' within a character class where X is not in {0-9,_,a-z,A-Z}.
An attempted derivation:
CharacterClass (https://tc39.github.io/ecma262/#prod-CharacterClass)
-> ClassRanges
-> NonemptyClassRanges
-> ClassAtom
-> ClassAtomNoDash
-> ClassEscape (Annex B)
-> CharacterEscape
-> Nothing matches at this point.
A couple of similar but valid cases:
/[\c0]/, ..., /[\c9]/, /[\c_]/ (ClassControlLetter)
/[\ca]/, ..., /[\cZ]/ (ControlLetter)
/\c%/ (ok outside of character class)
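The cases listed above can be probed with a short script. This is a minimal sketch, not part of the original report: `probe` and `controlValue` are hypothetical helper names, and the results depend on the engine the script runs in. Patterns are built with the RegExp constructor so that a rejected pattern shows up as a catchable SyntaxError instead of a parse failure of the whole script.

```javascript
// Hypothetical helper: compile `source` as a whole-string pattern and test `input`.
// Returns true/false, or the string "SyntaxError" if the engine rejects the pattern.
function probe(source, input, flags = "") {
  try {
    return new RegExp("^(?:" + source + ")$", flags).test(input);
  } catch (err) {
    if (err instanceof SyntaxError) return "SyntaxError";
    throw err;
  }
}

// Control escapes keep the low five bits of the character after \c.
function controlValue(ch) {
  return String.fromCharCode(ch.charCodeAt(0) & 0x1f);
}

const cases = [
  ["[\\ca]", controlValue("a")], // ControlLetter
  ["[\\c0]", controlValue("0")], // ClassControlLetter (Annex B)
  ["[\\c_]", controlValue("_")], // ClassControlLetter (Annex B)
  ["[\\c%]", "%"],               // the case this issue is about
];

for (const [source, input] of cases) {
  console.log(source, JSON.stringify(input), probe(source, input));
}

// With the u flag the Annex B extensions do not apply, so the class is rejected.
console.log("[\\c%] with u:", probe("[\\c%]", "%", "u"));
```

Running the same file under several engines (for example with eshost) makes any disagreement visible in one place.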
,
Apr 4 2017
/^[\c%]*$/.test("\\c%") -->
v8: true
firefox: true
,
Apr 4 2017
Some related work was started in this patch: https://github.com/tc39/ecma262/commit/fbdfda6f2a613f3c4813d4b34e32f5c5134cf921

However, Andre may have left out this case as its interpretation differs between browsers. In particular, ChakraCore differs from the agreement between SpiderMonkey, V8 and JSC. SM, V8 and JSC will treat [\c%] as [\\c%], but ChakraCore will treat it as [\x05], taking the lower 5 bits, as for other control escapes. For a class like [\c], SM, V8 and JSC will treat it as [\\c], whereas ChakraCore will treat it as [c].

IMO it would be reasonable to standardize on the 3/4 behavior for both of these cases.

-----
Boring raw tests:

littledan@littledan-ThinkPad-T460p:~/v8/v8$ eshost -e '/^[\c%]$/.test("\\")'
#### chakracore
false
#### d8
true
#### jsc
true
#### spidermonkey
true

littledan@littledan-ThinkPad-T460p:~/v8/v8$ eshost -e '/[\c%]/.test("c")'
#### chakracore
false
#### d8
true
#### jsc
true
#### spidermonkey
true

littledan@littledan-ThinkPad-T460p:~/v8/v8$ eshost -e '/[\c%]/.test("\x05")'
#### jsc
false
#### chakracore
true
#### d8
false
#### spidermonkey
false

littledan@littledan-ThinkPad-T460p:~/v8/v8$ eshost -e '/^[\c]$/.test("c")'
#### jsc
true
#### d8
true
#### chakracore
true
#### spidermonkey
true
#### v8debug

littledan@littledan-ThinkPad-T460p:~/v8/v8$ eshost -e '/^[\c]$/.test("\\")'
#### jsc
true
#### d8
true
#### spidermonkey
true
#### chakracore
false
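The two readings described in this comment can be written down as a small model. This is an illustrative sketch only, not code from any engine: `majorityReading` and `chakraReading` are hypothetical names, and each returns the set of characters that a class of the form [\cX] would accept, where `next` is the character after \c (or the empty string for the class [\c]).

```javascript
// SpiderMonkey, V8 and JSC: the backslash is taken literally,
// so the class contains "\", "c" and whatever follows.
function majorityReading(next) {
  return new Set(["\\", "c", ...next]);
}

// ChakraCore: \cX is treated as a control escape (low 5 bits of X),
// and a bare \c collapses to the single character "c".
function chakraReading(next) {
  if (next === "") return new Set(["c"]);
  return new Set([String.fromCharCode(next.charCodeAt(0) & 0x1f)]);
}

for (const next of ["%", ""]) {
  console.log(
    JSON.stringify(next),
    [...majorityReading(next)].map((ch) => JSON.stringify(ch)),
    [...chakraReading(next)].map((ch) => JSON.stringify(ch))
  );
}
```

Checking the five raw tests above against these sets reproduces the reported true/false pattern for each engine group.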
,
Apr 4 2017
Just so I understand this correctly: for the case outside [], do all browsers agree on the strange Annex B-sanctioned interpretation, where \c% matches the same as \\c% would match? And for the case inside [], is the disagreement that the non-Chakra engines match the same as [\\c%], while Chakra matches only the single code point 5?
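Both halves of this question can be checked directly. The sketch below is an illustration and not part of the original thread; the variable names are made up, and the output for the in-class case is expected to differ between engines, as the raw tests earlier in the thread show.

```javascript
// Outside a character class: \c% versus the spelled-out \\c%.
const outside = new RegExp("^\\c%$");      // same pattern as the literal /^\c%$/
const spelledOut = new RegExp("^\\\\c%$"); // same pattern as the literal /^\\c%$/
console.log("outside:", outside.test("\\c%"), spelledOut.test("\\c%"));

// Inside a character class: see which single characters [\c%] accepts.
const inside = new RegExp("^[\\c%]$");     // same pattern as the literal /^[\c%]$/
for (const ch of ["\\", "c", "%", "\x05"]) {
  console.log("inside:", JSON.stringify(ch), inside.test(ch));
}
```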
,
Apr 4 2017
If #5 is correct then I think Chakra has surprising behavior.
,
Apr 4 2017
Upstream bug to cross-reference: https://github.com/tc39/ecma262/issues/863
,
Apr 7 2017
The following revision refers to this bug:
https://chromium.googlesource.com/v8/v8.git/+/4498419438746bf94fc6a296ccc2eb61a57e2738

commit 4498419438746bf94fc6a296ccc2eb61a57e2738
Author: jgruber <jgruber@chromium.org>
Date: Fri Apr 07 07:52:10 2017

[regexp] Add tests for recent changes in Annex B

See https://github.com/tc39/ecma262/pull/303.

BUG=v8:5937,v8:6201
Review-Url: https://codereview.chromium.org/2793313002
Cr-Commit-Position: refs/heads/master@{#44467}

[modify] https://crrev.com/4498419438746bf94fc6a296ccc2eb61a57e2738/src/regexp/regexp-parser.cc
[modify] https://crrev.com/4498419438746bf94fc6a296ccc2eb61a57e2738/test/mjsunit/regexp.js
,
Apr 7 2017
See also https://github.com/tc39/ecma262/pull/864 https://github.com/tc39/ecma262/issues/863
,
Jun 8 2017
The specification fix here got consensus at TC39; you can see a draft at https://github.com/tc39/ecma262/pull/864 (but this spec patch has to be reworded). Irregexp is working as intended!
Comment 1 by littledan@chromium.org
, Apr 4 2017
This syntax is supported across browsers. I think the right fix here would be to update Annex B with whatever the real current cross-browser grammar is. In particular, I wonder if the "but not c" clause of IdentityEscape is implemented in browsers.

littledan@littledan-ThinkPad-T460p:~/v8/v8$ eshost -e "/[\c%]/.exec('')"
#### jsc
null
#### chakracore
null
#### d8
null
#### spidermonkey
null