
How do I commit with a utf-8 message file?

I am trying to commit a revision with subversion on cmd.exe. The cmd.exe's codepage is utf-8 (set with chcp 65001):

c:\path\to\work\dir> svn ci

Since I did not specify a message with the -m flag, and the variable SVN_EDITOR is set to gvim, gvim opens and I can enter my message. I save the file as UTF-8 (:set fileencoding=utf8) and quit the editor.

Now the svn client (?) tells me: Auf ... .folgte ein nicht-ASCII Byte 195, das nicht von/nach UTF-8 konvertiert werden konnte (which I believe corresponds to the English message: Non-ASCII character (code %d) detected, and unable to convert to/from UTF-8).

This is strange since I am quite sure that the message file I stored is in UTF-8 format.

I also tried storing it in latin-1, but with the same effect.

Edit

I did a test with the message ü. The hex content of the file is

0000000: c3bc 0d0a 2d2d 2044 6965 7365 2075 6e64  ....-- Diese und
0000010: 2064 6965 2066 6f6c 6765 6e64 656e 205a   die folgenden Z
0000020: 6569 6c65 6e20 7765 7264 656e 2069 676e  eilen werden ign
0000030: 6f72 6965 7274 202d 2d0d 0a0d 0a41 2020  oriert --....A
0000040: 2020 780d 0a                               x..

Note the first four bytes (ü followed by \x0d\x0a). The ü is encoded as c3 bc, which is the UTF-8 representation of LATIN SMALL LETTER U WITH DIAERESIS, that is, the desired ü.

Note also that the error message (in this new case: Ein Nicht-ASCII Zeichen (Kode 195) wurde gefunden, das nicht von/nach UTF-8 konvertiert werden konnte) complains about 195, which is decimal for c3, the very first byte in the file. Of course, the error message is right that this is not an ASCII character, but isn't that the whole point of using UTF-8 files?
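The byte-level reasoning above can be checked with a few lines of Python. This is only a sketch for verifying the dump: the hex string is copied from the first four bytes shown above, and the variable names are made up for illustration. It says nothing about what the svn client does internally.

```python
# First four bytes of the message file, copied from the hex dump: c3 bc 0d 0a
raw = bytes.fromhex("c3bc0d0a")

# The byte the error message complains about: 0xc3 is 195 in decimal.
first_byte = raw[0]

# Decoding as UTF-8 succeeds and yields U+00FC (ü) followed by CR LF.
as_utf8 = raw.decode("utf-8")

# Decoding as plain ASCII fails at offset 0, i.e. at byte 195.
failing_offset = None
try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    failing_offset = exc.start

print(first_byte == 0xC3, as_utf8 == "\u00fc\r\n", failing_offset)
```

So the file really is valid UTF-8, and the byte named in the error is exactly the lead byte of the two-byte sequence for ü.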

Edit 2

I tried to commit the message in UTF-8 format because I believed this to be the most natural thing. Obviously SVN, at least on cmd.exe, doesn't think so. I don't care which format I need to commit the message in, as long as I can commit an ü and other German special characters.

asked Nov 07 '13 10:11 by René Nyffenegger


1 Answer

It looks like the svn commit command actually accepts an argument to tell SVN what encoding your commit message is in. Try svn commit --encoding UTF-8.

http://svnbook.red-bean.com/en/1.7/svn.ref.svn.html says:

--encoding ENC

Tells Subversion that your commit message is composed using the character encoding provided. The default character encoding is derived from your operating system's native locale; use this option if your commit message is composed using any other encoding.
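If you would rather script this than type it each time, a minimal sketch could look like the following. It assumes svn is on the PATH and that the message is passed via svn's --file option instead of the editor; the helper names build_commit_command and commit_with_message and the file name commit-message.txt are hypothetical, chosen only for this example.

```python
import subprocess
from typing import List


def build_commit_command(message_file: str, encoding: str = "UTF-8") -> List[str]:
    """Return the argv for an svn commit that reads its log message from a file."""
    # --file and --encoding are standard svn commit options; the rest is illustrative.
    return ["svn", "commit", "--file", message_file, "--encoding", encoding]


def commit_with_message(message: str, message_file: str = "commit-message.txt") -> int:
    """Write the message as UTF-8 and run svn commit on the current directory."""
    # Store the log message as UTF-8 so it matches the declared --encoding.
    with open(message_file, "w", encoding="utf-8", newline="\n") as handle:
        handle.write(message)
    # Return svn's exit code (0 on success); output goes to the console.
    return subprocess.run(build_commit_command(message_file)).returncode
```

Calling commit_with_message("ü") from the working copy would then run svn commit --file commit-message.txt --encoding UTF-8. I have not tested this on cmd.exe with codepage 65001, so treat it as a starting point.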

answered Sep 29 '22 06:09 by Ben