
Issue 852585


Issue metadata

Status: Fixed
Owner:
Closed: Jan 8
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux, Chrome
Pri: 3
Type: Feature




minify messages.json files

Project Member Reported by vapier@chromium.org, Jun 13 2018

Issue description

on a Pixel Chromebook, we currently ship ~440 messages.json files which add up to ~21M.  these files are usually not optimally encoded:
- they use \u#### escapes instead of UTF-8
- they have lots of useless whitespace

converting to UTF-8 shaves off ~1.5MiB.  stripping the whitespace shaves off ~3.8MiB.  let's not waste space on files no one reads, especially when the source material is grd files checked in elsewhere.
 
Project Member

Comment 1 by bugdroid1@chromium.org, Jun 14 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/364c6973e1d6dd7ed27e5d33b833f401d978446f

commit 364c6973e1d6dd7ed27e5d33b833f401d978446f
Author: Mike Frysinger <vapier@chromium.org>
Date: Thu Jun 14 01:46:40 2018

grit: format output using UTF-8

Since Chrome is the only thing that loads these files, there's no need
to force them to use JSON escapes for non-ASCII Unicode codepoints.
Just use native UTF-8 encoding everywhere (especially since all the
files already have Unicode BOMs in them!).

It shouldn't make the readability of the files worse.  If anything, it
should improve it greatly, especially relative to the source grd files
(which use UTF-8), and for native readers.  Example delta for Arabic:
	CHROMEVOX_COLUMN_GRANULARITY
-		"\u0639\u0645\u0648\u062f"
+		"عمود"

On a Pixelbook, converting all the messages.json files in this way
shaves off ~1.4MiB.

Bug: 852585
Change-Id: I1978141717bbd69aec5a72c9d6769c6998419c1b
Reviewed-on: https://chromium-review.googlesource.com/1099881
Reviewed-by: agrieve <agrieve@chromium.org>
Commit-Queue: Mike Frysinger <vapier@chromium.org>
Cr-Commit-Position: refs/heads/master@{#567085}
[modify] https://crrev.com/364c6973e1d6dd7ed27e5d33b833f401d978446f/tools/grit/grit/format/chrome_messages_json.py
[modify] https://crrev.com/364c6973e1d6dd7ed27e5d33b833f401d978446f/tools/grit/grit/format/chrome_messages_json_unittest.py
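
The size difference described in this commit can be reproduced with Python's standard json module. This is a minimal sketch, not grit's actual formatter code: the message key and Arabic string come from the example delta above, while the surrounding dictionary layout is an assumption made for illustration.

```python
import json

# The Arabic word from the commit's example delta, written with Python
# escapes so the code points are unambiguous: U+0639 U+0645 U+0648 U+062F.
arabic = "\u0639\u0645\u0648\u062f"

# Hypothetical entry layout; only the key and the string come from the commit.
messages = {"CHROMEVOX_COLUMN_GRANULARITY": {"message": arabic}}

# Default behaviour: non-ASCII code points become \uXXXX JSON escapes.
escaped = json.dumps(messages)
# ensure_ascii=False keeps the characters as-is, to be written out as UTF-8.
native = json.dumps(messages, ensure_ascii=False)

escaped_bytes = escaped.encode("utf-8")
native_bytes = native.encode("utf-8")

# Both encodings parse back to the same data.
assert json.loads(escaped) == json.loads(native) == messages

print(len(escaped_bytes), len(native_bytes))
```

Each of these Arabic code points takes 6 bytes as a \uXXXX escape but 2 bytes in UTF-8, which is where the savings come from for non-Latin locales.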

Project Member

Comment 2 by bugdroid1@chromium.org, Jun 14 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/8d3ba5c3988e897bc65c629b4172a70367c140f5

commit 8d3ba5c3988e897bc65c629b4172a70367c140f5
Author: Mike Frysinger <vapier@chromium.org>
Date: Thu Jun 14 19:55:04 2018

grit: drop Unicode BOM from Chrome messages.json

The BOM was added back in Dec 2012 [1] because it said the CWS wanted
it in its files.  However, I haven't seen this requirement in the last
few years with my own extensions (uploaded plenty w/out BOMs and the
localization still works), and our public docs [2] don't mention it.
Drop the BOM to shrink (slightly) the files and to make it easier
for other JSON parsers to handle this (as not all skip the BOM).

Further, reading the CWS source directly indicates it is not required
and is silently ignored.  See CrxMessagesParserImpl.java:parse() which
loads the JSON data through ManifestSanitizerUtil.sanitize(), and that
ManifestSanitizerUtil.java:sanitize function simply does:
  // The incoming manifest may have a byte-order marker (0xFEFF) to denote
  // Unicode as its first character.  Although it's not allowed in pure JSON,
  // we remove it here so parse just works
  if (content.charAt(0) == Constants.UNICODE_BYTE_ORDER_MARK) {
    content = content.substring(1);
  }
  return content;

Url-1: https://codereview.chromium.org/11557029
Url-2: https://developer.chrome.com/extensions/i18n-messages
Bug: 852585
Change-Id: I65f61eecb1147cd4c05f3a7a2295bef85023cb65
Reviewed-on: https://chromium-review.googlesource.com/1101239
Reviewed-by: agrieve <agrieve@chromium.org>
Reviewed-by: Robert Flack <flackr@chromium.org>
Commit-Queue: Mike Frysinger <vapier@chromium.org>
Cr-Commit-Position: refs/heads/master@{#567387}
[modify] https://crrev.com/8d3ba5c3988e897bc65c629b4172a70367c140f5/tools/grit/grit/tool/build.py
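
To illustrate the commit's point that not every JSON parser skips a BOM, here is a small sketch using only the Python standard library. The sample payload is made up for illustration and is not taken from grit or the CWS. Python's json.loads rejects a string that begins with U+FEFF, whereas decoding the bytes with the utf-8-sig codec strips a leading BOM first, much like the sanitize() snippet quoted above.

```python
import codecs
import json

# Made-up payload, encoded the way a BOM-prefixed messages.json would be.
payload = '{"greeting":{"message":"hi"}}'
with_bom = codecs.BOM_UTF8 + payload.encode("utf-8")

# Plain UTF-8 decoding keeps U+FEFF as the first character, and
# json.loads refuses to parse it.
try:
    json.loads(with_bom.decode("utf-8"))
    bom_rejected = False
except json.JSONDecodeError:
    bom_rejected = True

# The utf-8-sig codec drops a leading BOM if one is present.
parsed = json.loads(with_bom.decode("utf-8-sig"))

print(bom_rejected, parsed)
```

The BOM itself costs only 3 bytes per file, so the larger benefit of dropping it is interoperability rather than size.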

Status: Assigned (was: Available)
Project Member

Comment 4 by bugdroid1@chromium.org, Nov 14

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/830ebc996bf084166bc3d5e1ad972f6f55270111

commit 830ebc996bf084166bc3d5e1ad972f6f55270111
Author: Mike Frysinger <vapier@chromium.org>
Date: Wed Nov 14 22:28:18 2018

grit: minify json outputs

This can easily save ~1k if not ~10k+ per message file.  Looking at an
example CrOS image today, this can add up to MB of data in the rootfs.

We don't make this optional for now as it's not clear whether anyone
really cares, and it's easy enough for people to pretty print the json
files after the fact.

Bug: 852585
Change-Id: I1d2331dfc38d404ea03facd3d0a8845f32c4f981
Reviewed-on: https://chromium-review.googlesource.com/c/1320515
Reviewed-by: Robert Flack <flackr@chromium.org>
Commit-Queue: Mike Frysinger <vapier@chromium.org>
Cr-Commit-Position: refs/heads/master@{#608146}
[modify] https://crrev.com/830ebc996bf084166bc3d5e1ad972f6f55270111/tools/grit/grit/format/chrome_messages_json.py
[modify] https://crrev.com/830ebc996bf084166bc3d5e1ad972f6f55270111/tools/grit/grit/format/chrome_messages_json_unittest.py
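
The following sketch shows one way to minify JSON output and to pretty print it again after the fact, using Python's standard json module. It is an illustration under assumptions, not grit's actual implementation: the sample messages are hypothetical, and whether grit uses these exact arguments is not stated in the commit message.

```python
import json

# Hypothetical messages; the entry layout is assumed for illustration.
messages = {
    "appName": {"message": "Example", "description": "Hypothetical entry."},
    "greeting": {"message": "Hello"},
}

# Human-friendly form: newlines plus two-space indentation.
pretty = json.dumps(messages, indent=2, ensure_ascii=False)

# Minified form: no newlines, and no spaces after ',' or ':'.
minified = json.dumps(messages, separators=(",", ":"), ensure_ascii=False)

# Pretty printing after the fact recovers the readable form.
restored = json.dumps(json.loads(minified), indent=2, ensure_ascii=False)

print(len(pretty), len(minified))
```

Because minification only removes insignificant whitespace, the parsed data is unchanged and the readable layout can be regenerated whenever someone needs to inspect a file.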

Status: Fixed (was: Assigned)
this is largely done.  grit outputs minified messages.json by default now.
