
Issue 808905

Starred by 3 users

Issue metadata

Status: Fixed
Owner:
Long OOO (go/where-is-mgiuca)
Closed: Feb 2018
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug




git hyper-blame output is tabulated badly for non-ASCII characters

Project Member Reported by mgiuca@chromium.org, Feb 5 2018

Issue description

git hyper-blame counts UTF-8 code units, rather than code points, when computing string widths for its output table.

What steps will reproduce the problem?
(1) Add a line to a file.
(2) git commit --author="超 Blame 💩"
(3) git hyper-blame <file>, and find the added line.

What is the expected result?
All rows in the table align.

What happens instead?
The line with Unicode characters in the author is shorter than the others.

1bb81a08 (Matt Giuca                 2018-01-29 14:59:16 +1100 1735)           given by the following algorithm. The algorithm takes a
981f9268 (超 Blame 💩             2018-02-05 14:26:43 +1100 1736)         whatever
1bb81a08 (Matt Giuca                 2018-01-29 14:59:16 +1100 1737)           <a>USVString</a> <var>value</var>, a <a>URL</a> <var>manifest

Python 2's shitty Unicode support strikes again. Can we upgrade to Python 3 one day?

This is because git hyper-blame uses len(str), which in Python 2 counts the bytes of the string, not the characters.

We could use *graphemes* but in my experience, a terminal won't display graphemes in a single-character box anyway, so the simplest thing is just to count code points.
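
To illustrate the mismatch, here is a minimal sketch. It assumes Python 3, where str is a sequence of code points and bytes holds the UTF-8 encoding; the variable names are illustrative and not taken from git_hyper_blame.py.

```python
# The author name from the repro steps above.
author = "超 Blame 💩"

# Roughly what len() sees on a Python 2 byte string: UTF-8 code units.
byte_count = len(author.encode("utf-8"))

# What the table should be padded by: code points.
code_point_count = len(author)

# Each byte beyond the code point count is one space of padding that
# goes missing from this row.
missing_padding = byte_count - code_point_count

print(byte_count, code_point_count, missing_padding)
```

The difference between the two counts is how many columns short the row ends up. Code points are still only an approximation of terminal width, as noted above.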
 
Project Member

Comment 1 by bugdroid1@chromium.org, Feb 12 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/tools/depot_tools/+/a148b5ee55b107b0ad99b0ea69060b6609d97656

commit a148b5ee55b107b0ad99b0ea69060b6609d97656
Author: Matt Giuca <mgiuca@chromium.org>
Date: Mon Feb 12 23:28:47 2018

git hyper-blame: Fix tabulation of Unicode characters in author name.

Previously, it counted the number of UTF-8 bytes when spacing out the
table, not the number of code points.

Bug: 808905
Change-Id: Ice5504089e0f7097e108c6dfbbb810620b9dfc94
Reviewed-on: https://chromium-review.googlesource.com/901142
Commit-Queue: Matt Giuca <mgiuca@chromium.org>
Reviewed-by: Robbie Iannucci <iannucci@chromium.org>

[modify] https://crrev.com/a148b5ee55b107b0ad99b0ea69060b6609d97656/git_hyper_blame.py
[modify] https://crrev.com/a148b5ee55b107b0ad99b0ea69060b6609d97656/tests/git_hyper_blame_test.py
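
The approach described in the commit message (pad by code points rather than bytes) could look roughly like the sketch below. This is not the actual change to git_hyper_blame.py: format_rows and to_text are hypothetical helpers, and the sketch assumes Python 3 and UTF-8 input.

```python
def to_text(cell):
    """Decode a UTF-8 byte string to text; pass text through unchanged."""
    if isinstance(cell, bytes):
        return cell.decode("utf-8")
    return cell


def format_rows(rows):
    """Return one aligned line per row, padding columns by code points.

    Hypothetical helper for illustration only.
    """
    decoded = [[to_text(cell) for cell in row] for row in rows]
    # Column width = longest cell in that column, measured in code points.
    widths = [max(len(cell) for cell in column) for column in zip(*decoded)]
    lines = []
    for row in decoded:
        padded = [cell + " " * (width - len(cell))
                  for cell, width in zip(row, widths)]
        lines.append(" ".join(padded))
    return lines


rows = [
    ["1bb81a08", "Matt Giuca", "1735"],
    ["981f9268", "超 Blame 💩".encode("utf-8"), "1736"],
]
for line in format_rows(rows):
    print(line)
```

Because every cell is decoded before its length is measured, both rows come out with the same number of code points, regardless of how many bytes the author name takes in UTF-8.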

Comment 2 by mgiuca@chromium.org, Feb 12 2018

Status: Fixed (was: Started)
