What are some common character encodings that a text editor should support?

Question

I have a text editor that can load ASCII and Unicode files. It automatically detects the encoding by looking for the BOM at the beginning of the file and/or searching the first 256 bytes for characters > 0x7f.

What other encodings should be supported, and what characteristics would make that encoding easy to auto-detect?

Steve Emmerson · Accepted Answer

Definitely UTF-8. See http://www.joelonsoftware.com/articles/Unicode.html.

As far as I know, there's no guaranteed way to detect UTF-8 automatically, although scanning the file for byte sequences that are not valid UTF-8 makes a mistaken diagnosis very unlikely.
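As a rough illustration of that kind of scan, here is a minimal sketch in Python. It checks for a BOM first, then tries a strict ASCII decode, then a strict UTF-8 decode. The function name `guess_encoding` and the `latin-1` fallback are my own assumptions for the example, not something the question or the linked article prescribes; a real editor would probably let the user override the guess.

```python
import codecs

# BOMs are checked in this order so that the UTF-32 LE BOM (FF FE 00 00)
# is not mistaken for the UTF-16 LE BOM (FF FE), which is its prefix.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(data: bytes, fallback: str = "latin-1") -> str:
    """Return a Python codec name that is a plausible guess for `data`.

    `fallback` is an assumption for this sketch: latin-1 can decode any
    byte sequence, so it never fails, but it may not be what the author used.
    """
    # 1. A BOM, when present, is the strongest hint.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name

    # 2. No bytes above 0x7F means plain ASCII.
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass

    # 3. Text in a legacy 8-bit encoding rarely happens to be valid UTF-8,
    #    so a successful strict decode is good (not perfect) evidence.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return fallback
```

If you only scan a prefix of the file, as the question does with its first 256 bytes, keep in mind that the cut may land in the middle of a multi-byte sequence, which a strict decode would report as an error.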

mletterle · Answer

I don't know about encodings, but make sure it can support the different line-ending conventions (\n vs \r\n)!
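For what it's worth, detecting the line-ending convention can be as simple as counting. The sketch below is only one way to do it; the function names are made up for the example, and it also counts bare `\r` (old Mac style), which goes slightly beyond the two conventions mentioned above. It works on raw bytes, so it assumes an ASCII-compatible encoding such as UTF-8; for UTF-16 you would decode first and count on the resulting string.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF, bare LF and bare CR line endings in `data`."""
    crlf = data.count(b"\r\n")
    # Every CRLF pair also contains one \n and one \r, so subtract the pairs.
    lf = data.count(b"\n") - crlf
    cr = data.count(b"\r") - crlf
    return {"\r\n": crlf, "\n": lf, "\r": cr}


def dominant_line_ending(data: bytes, default: str = "\n") -> str:
    """Return the most frequent line ending, or `default` if there are none."""
    counts = count_line_endings(data)
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else default
```

An editor could use the dominant ending when saving so that it does not silently convert the file, and warn when the counts show mixed endings.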

If you haven't checked out Mich Kaplan's blog yet, I suggest doing so: http://blogs.msdn.com/michkap/

Specifically this article may be useful: http://www.siao2.com/2007/04/22/2239345.aspx

Tags:

encoding

unicode

text-editor

character

Nathan Osman
