
Tool to convert source code from a codepage to UTF-8?

I'm working on an open source project. The original project contains comments in Russian and uses codepage 1251. I'm using codepage 1252, so the Russian comments aren't displayed correctly in Visual Studio Express 2008. That's not nice, but I can't read Russian anyway. Someone using codepage 950 (Traditional Chinese) tried to compile the project and was unable to, because of the code page! Now it is really annoying.

I think that using Unicode (more precisely, UTF-8 with signature) as the file format for the source code is the way to go.
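
To illustrate the mismatch described above, here is a minimal Python sketch. The sample word is an assumption chosen for illustration, not text from the project. It shows the same bytes read under codepage 1251 and codepage 1252, and what "UTF-8 with signature" adds to the file.

```python
# Illustrative sample only: the Russian word for "comment",
# as it would be stored in a codepage 1251 source file.
original = "комментарий"
raw = original.encode("cp1251")

# Reading the same bytes with the wrong codepage (1252) yields mojibake.
garbled = raw.decode("cp1252")
print(garbled)

# Decoding with the correct codepage and re-encoding as "utf-8-sig"
# produces UTF-8 prefixed with the signature bytes EF BB BF.
utf8_with_signature = raw.decode("cp1251").encode("utf-8-sig")
print(utf8_with_signature[:3].hex())
```

The signature lets an editor or compiler recognise the file as UTF-8 regardless of the machine's codepage.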

Problem: how to convert the whole source code easily?

I have already thought about:

  • Let Visual Studio save the source code as UTF-8. But: my computer is using codepage 1252, and I found no way to tell VS that the original source code uses codepage 1251, so the conversion won't be correct.

    Edit: As pointed out by "LicenseQ", there is a way to open a single file in VS with another encoding: click the arrow next to the Open button in the Open dialog, choose "Open With", and then choose "Code Editor (with encoding)".

  • Of course, I could change the codepage of my computer for the duration of the conversion. But it's a global setting in Windows and requires a reboot, so I'm looking for a friendlier solution.

  • I've found a tool called CodePageConverter which does exactly what I need, but it can't run as a batch job.

Does anyone know another tool (a command line tool would be perfect) to convert from a codepage to UTF-8?

Edit: As suggested by tkotitan, iconv seems to be the solution I was looking for. There is a Windows version of iconv. And now that I know the name of this tool, I was able to find other posts on Stack Overflow dealing with similar issues.

asked Feb 06 '09 19:02




1 Answer

In the Unix world, the utility is called iconv.

I'm not sure whether there is a Windows equivalent.
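
If iconv is not at hand, the same batch conversion can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a replacement for iconv: the function name `convert_tree`, the list of file extensions, and the default encodings are all hypothetical choices for illustration. It rewrites files in place, so run it on a copy or a clean version-control checkout.

```python
from pathlib import Path

# Assumption: which extensions count as source files; adjust for the project.
SOURCE_EXTENSIONS = {".c", ".cpp", ".h", ".hpp"}
UTF8_SIGNATURE = b"\xef\xbb\xbf"


def convert_tree(root, src_encoding="cp1251", dst_encoding="utf-8-sig",
                 extensions=SOURCE_EXTENSIONS):
    """Re-encode matching files under root in place; return converted paths."""
    converted = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in extensions:
            continue
        raw = path.read_bytes()
        if raw.startswith(UTF8_SIGNATURE):
            continue  # already UTF-8 with signature, leave untouched
        # Raises UnicodeDecodeError if a byte is undefined in src_encoding.
        text = raw.decode(src_encoding)
        # Working on bytes keeps the original line endings unchanged.
        path.write_bytes(text.encode(dst_encoding))
        converted.append(path)
    return converted
```

Calling `convert_tree("path/to/project")` converts every matching file below that directory. Files that already start with the UTF-8 signature are skipped, so the function can be run twice without double-encoding anything.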

tkotitan answered Oct 15 '22 09:10