New issue
Advanced search Search tips

Issue 651204 link

Starred by 2 users

Issue metadata

Status: Duplicate
Merged: issue 659528
Owner:
Closed: Oct 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 3
Type: Bug



Sign in to add a comment

clang_format.py fails on utf-8 files on windows

Project Member Reported by scottmg@chromium.org, Sep 28 2016

Issue description

I'd been using an old hacked local up clang_format.py for a long time, but recently switched to the buildtools one on a new machine. On files with non-ascii I get

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "c:/src/cr/src/buildtools/clang_format/script/clang-format.py", line 110, in <module>
    main()
  File "c:/src/cr/src/buildtools/clang_format/script/clang-format.py", line 87, in main
    stdout, stderr = p.communicate(input=text.encode(encoding))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 6632: ordinal not in range(128)


It hasn't changed upstream https://llvm.org/svn/llvm-project/cfe/trunk/tools/clang-format/clang-format.py

But I think the

  text.encode(encoding)

call should really be

  text.decode(encoding).encode('utf-8')

or something like that as we're going from the vim-buffer's encoding to a python string, and then to an encoding that clang-format wants as input. Does that make sense?

(Not sure if anyone else uses this script... Nico?)
 

Comment 1 by thakis@chromium.org, Sep 28 2016

I use this script on Windows, but I guess I never used it on a non-ansi file.
Ah, OK. Mark is very enthusiastic about using fancy Mac-y punctuation in Crashpad. :)

I guess I'll submit that patch upstream if it doesn't look crazy.


Comment 3 by thakis@chromium.org, Sep 28 2016

Is there code that converts back from utf8 to fileformat after formatting?
Huh, yeah, good point. The load back in is doing decode(encoding), which should actually be decode('utf-8') in that case. But then it just applies diff transforms to the buffer, so I don't think it needs to do any re-encoding the the buffer's format.

Or I'm totally confused, which is also possible.

Comment 5 by thakis@chromium.org, Oct 24 2016

http://llvm.org/viewvc/llvm-project?revision=284988&view=revision landed recently. It touches a different line though, so not sure if it's related.
Status: Fixed (was: Assigned)
Yeah, that fixes it. Thanks. (I guess it's probably not worth rolling the script since no one else noticed, we can just pick it up next time we roll the binary.)

Comment 7 by thakis@chromium.org, Oct 27 2016

Mergedinto: 659528
Status: Duplicate (was: Fixed)
Jochen rolled it in  bug 659528 , so this is now all done.

Sign in to add a comment