
Batch file encoding

I would like to handle filenames containing unusual characters, such as the French é.

Everything is working fine in the shell:

C:\somedir\>ren -hélice hélice 

But if I put this line in a .bat file, I get the following result:

C:\somedir\>ren -hÚlice hÚlice 

See? The é has been replaced by Ú.

The same is true for command output. If I run dir on some directory in the shell, the output is fine. If I redirect this output to a file, some characters are transformed.

So how can I tell cmd.exe that what appears as an é in my batch file really is an é, and not a Ú or a comma?

So is there no way, when executing a .bat file, to give a hint about the code page in which it was written?

shodanex asked Sep 15 '09 15:09




2 Answers

You have to save the batch file with OEM encoding. How to do this depends on your text editor, and which OEM code page applies depends on the system locale. For Western European locales it's usually CP850.
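To see why the é turned into Ú, it can help to look at the bytes. The sketch below uses Python rather than batch, purely as an illustration. It assumes the editor saved the file as Windows-1252 and that the console uses CP850; `write_oem_batch` is a hypothetical helper name, not part of any tool.

```python
# Assumption: the editor saved the .bat file as Windows-1252 (the usual
# "ANSI" code page in Western Europe) and the console runs in CP850.
saved_bytes = "ren -hélice hélice".encode("cp1252")  # é is stored as byte 0xE9

# cmd.exe reads the same bytes using the console (OEM) code page,
# where byte 0xE9 means Ú.
seen_by_cmd = saved_bytes.decode("cp850")
# seen_by_cmd is now "ren -hÚlice hÚlice"


def write_oem_batch(path, script, codepage="cp850"):
    """Hypothetical helper: write a batch script encoded in an OEM code page."""
    # newline="" keeps the line endings exactly as given in `script`.
    with open(path, "w", encoding=codepage, newline="") as f:
        f.write(script)
```

Calling `write_oem_batch("rename.bat", "ren -hélice hélice\r\n")` would store é as byte 0x82, which is what a CP850 console expects.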

Batch files and encoding are really two things that don't particularly like each other. You'll notice that Unicode is also impossible to use there, unfortunately (even though environment variables handle it fine).

Alternatively, you can set the console to use another codepage:

chcp 1252 

should do the trick. At least it worked for me here.

When you do output redirection, such as with dir, the same rules apply. The console window's codepage is used. You can use the /u switch to cmd.exe to force Unicode output redirection, which causes the resulting files to be in UTF-16.
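A file produced this way cannot be read as ordinary ANSI or OEM text. As a rough sketch in Python (used here only for illustration), and assuming the redirected output is UTF-16 little-endian, possibly without a byte order mark, a reader could decode it like this. The name `decode_cmd_unicode_output` is hypothetical.

```python
import codecs


def decode_cmd_unicode_output(data: bytes) -> str:
    """Decode bytes assumed to come from something like: cmd /u /c "dir > out.txt"."""
    # Assumption: UTF-16 little-endian; drop a byte order mark if one is present.
    if data.startswith(codecs.BOM_UTF16_LE):
        data = data[len(codecs.BOM_UTF16_LE):]
    return data.decode("utf-16-le")


# Simulated file content; a real file would be read with open(path, "rb").
sample = "hélice\r\n".encode("utf-16-le")
text = decode_cmd_unicode_output(sample)
```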

As for encodings and code pages in cmd.exe in general, also see this question:

  • What encoding/code page is cmd.exe using

EDIT: As for your edit: No, cmd always assumes the batch file to be written in the console default codepage. However, you can easily include a chcp at the start of the batch:

chcp 1252>NUL
ren -hélice hélice

To make this more robust when used directly from the commandline, you may want to memorize the old code page and restore it afterwards:

@echo off
for /f "tokens=2 delims=:." %%x in ('chcp') do set cp=%%x
chcp 1252>nul
ren -hélice hélice
chcp %cp%>nul
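The for /f line splits the output of chcp on colons and periods and keeps the second token. The Python sketch below mimics that splitting to show what ends up in cp. The sample strings are assumptions about what chcp prints (the exact wording depends on the Windows display language), and `second_token` is a hypothetical helper.

```python
import re


def second_token(line, delims=":."):
    """Mimic for /f "tokens=2 delims=:.": split on any delimiter, take token 2."""
    # Like for /f, runs of delimiters count as one and empty tokens are skipped.
    pattern = "[" + re.escape(delims) + "]+"
    tokens = [t for t in re.split(pattern, line) if t]
    return tokens[1] if len(tokens) > 1 else None


english = second_token("Active code page: 850")  # " 850", leading space kept
german = second_token("Aktive Codepage: 850.")   # " 850", trailing period dropped
```

The leading space is harmless, because the value is only used as the argument of the final chcp call.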
Joey answered Oct 13 '22 02:10



I was having trouble with this, and here is the solution I found. Find the decimal number for the character you are looking for in your current code page.

For example, I'm in code page 437 (chcp tells you), and I want a degree sign, °. http://en.wikipedia.org/wiki/Code_page_437 tells me that the degree sign is number 248.

Then you find the Unicode character with the same number.

The Unicode character at 248 (U+00F8) is ø.

If you insert the Unicode character in your batch script, it will display to the console as the character you desire.

So my batch file

echo ø

prints

° 
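The trick works because of how the bytes line up. A short Python sketch (illustration only), assuming the batch file is saved in a single-byte encoding such as Windows-1252 and the console uses code page 437:

```python
# Code page 437 stores the degree sign as byte 248 (0xF8).
degree_byte = "°".encode("cp437")  # b"\xf8"

# The Unicode character with the same number is U+00F8.
same_number = chr(248)  # "ø"

# Assumption: the editor saves the script as Windows-1252, where ø is also
# byte 0xF8. The console then shows that byte using code page 437.
shown = same_number.encode("cp1252").decode("cp437")  # "°"
```

If the file were saved as UTF-8 instead, ø would be stored as two bytes and the console would show two unrelated characters, so this approach depends on the editor's encoding.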
dconman answered Oct 13 '22 01:10
