XML validation
Reported by vkatsika...@gmail.com, May 25 2016
Issue description
Chrome Version : Version 50.0.2661.102 (64-bit)
URLs (if applicable) : not applicable
Other browsers tested:
Safari: not tested
Firefox: not tested
IE: not tested
What steps will reproduce the problem?
(1) open local file
What is the expected result?
Open a file and show it linted in XML.
What happens instead?
============
This page contains the following errors:
error on line 18579 at column 25: Input is not proper UTF-8, indicate encoding !
Bytes: 0xE2 0x80 0xA2 0x20
Below is a rendering of the page up to the first error.
============
$ md5sum file.xml
6b0938f815c1e86e5b5f4e29b6fe9040 file.xml
$ xmllint --noout file.xml
I inspected the file manually, and it does indeed look like valid UTF-8:
============
$ head -n1 file.xml
<?xml version="1.0" encoding="UTF-8" ?>
$ grep -b 'ontwikkelen van maatwerk-applicaties' file.xml
2353958: <description><![CDATA[<p>• ontwikkelen van maatwerk-applicaties in php e
...
$ hexdump -s 2353958 -n 120 -v -e '"|" 32/1 "%_u" "|\n"' file.xml
| <description><![CDATA[<p>e280a2 o|
|ntwikkelen van maatwerk-applicat|
|ies in php en nodejs<br />e280a2 im|
|plementeren van enterpri|
============
0xE2 0x80 0xA2 is the valid UTF-8 encoding of 'BULLET' (U+2022). However, Chrome's error message says:
Bytes: 0xE2 0x80 0xA2 0x20 (note the 'SPACE' (U+0020) after the BULLET)
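The claim that these four bytes are well-formed can be checked independently of any browser. The following is a minimal sketch using only the Python standard library; it says nothing about how Chrome's parser handles the bytes, it only confirms that a strict UTF-8 decoder accepts them in isolation.

```python
import unicodedata

# The four bytes quoted in Chrome's error message.
reported = bytes([0xE2, 0x80, 0xA2, 0x20])

# errors="strict" raises UnicodeDecodeError on any malformed sequence.
decoded = reported.decode("utf-8", errors="strict")

# Look up the Unicode name of each decoded character.
names = [unicodedata.name(ch) for ch in decoded]
print(names)  # ['BULLET', 'SPACE']
```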
The bug reporting system limits attachments to 10 MB. I tried to create a minimal working example but failed (I attach two tests that work OK; they do NOT show the problem). The bug reproduces only with a 90 MB XML file (I understand why this claim is frustrating :( ). I can send the file another way if needed.
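For a file this large, the manual inspection can be complemented by a streaming check of the whole file. The sketch below is an assumption-laden illustration, not part of the original report: the function name `first_invalid_utf8_offset` and the 1 MiB chunk size are hypothetical choices. It reads the file in chunks and uses Python's incremental decoder so that a multi-byte sequence spanning two chunks is still decoded as one unit.

```python
import codecs


def first_invalid_utf8_offset(path, chunk_size=1 << 20):
    """Return the byte offset of the first invalid UTF-8 sequence, or None.

    Hypothetical helper for illustration; chunk_size is an arbitrary choice.
    """
    decoder = codecs.getincrementaldecoder("utf-8")(errors="strict")
    bytes_read = 0  # bytes handed to the decoder before the current chunk
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            # Bytes of an incomplete sequence carried over from earlier chunks.
            pending = len(decoder.getstate()[0])
            try:
                # final=True on the empty last read flags a truncated sequence.
                decoder.decode(chunk, final=(chunk == b""))
            except UnicodeDecodeError as exc:
                # exc.start is relative to (pending bytes + current chunk).
                return bytes_read - pending + exc.start
            if not chunk:
                return None
            bytes_read += len(chunk)
```

Calling `first_invalid_utf8_offset("file.xml")` returns `None` when the whole file decodes cleanly, which would support the conclusion that the error lies in the browser rather than in the file.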
Jun 10 2016
Jun 13 2016
Jul 22 2016
Thanks for the details provided. If possible, could you please upload the test case to a drive and share access with ranjitkan@chromium.org? This will help us triage the issue. Thanks!
Jul 29 2016
Assigning to kojii@ for confirmation.
Jul 29 2016
Thanks for looking into this. I sent an email with the test case to everyone mentioned on the issue.
Jul 29 2016
PS: the same problem persists on 51.0.2704.84 and 52.0.2743.82.
Jul 29 2016
This could be the same as issue 612430, which was fixed in M53. Could you try beta?
Jul 29 2016
I tried 53.0.2785.34 beta. The 100M XML file takes forever to load, so I can't confirm that it parsed the whole thing (unless there is a way to see this from the threads' arguments in top?). However, it previously failed quite fast, so I would guess it now parses past the point where it was failing.
Jul 30 2016
Thank you for the verification. I'm sorry that Chrome is not a good tool for browsing a 100MB XML file, but I'm glad to hear we're one step forward.
Comment 1 by ashej...@chromium.org, May 25 2016