How to find out the Encoding of a File? C#

Tags:

c#

I need to find out which of the files in a directory are UTF-8 encoded and which are ANSI encoded, so that I can convert them to another encoding that I will decide on later. How can I tell whether a file is UTF-8 or ANSI encoded? Both encodings actually occur in my files.


asked Aug 04 '10 09:08

darkdog

1 Answer

There is no reliable way to do it (since the file might be just random binary); however, the process used by Windows Notepad is detailed in Michael S. Kaplan's blog:

http://www.siao2.com/2007/04/22/2239345.aspx

1. Check the first two bytes:
   1. If there is a UTF-16 LE BOM, then treat it (and load it) as a "Unicode" file;
   2. If there is a UTF-16 BE BOM, then treat it (and load it) as a "Unicode (Big Endian)" file;
   3. If the first two bytes look like the start of a UTF-8 BOM, then check the next byte, and if we have a UTF-8 BOM, then treat it (and load it) as a "UTF-8" file;

2. Check with IsTextUnicode to see if that function thinks it is BOM-less UTF-16 LE; if so, then treat it (and load it) as a "Unicode" file;

3. Check to see if it is UTF-8 using the original RFC 2279 definition from 1998, and if it is, then treat it (and load it) as a "UTF-8" file;

4. Assume an ANSI file using the default system code page of the machine.

Now note that there are some holes here, like the fact that step 2 does not do quite as well with BOM-less UTF-16 BE (there may even be a bug here, I'm not sure -- if so it's a bug in Notepad beyond any bug in IsTextUnicode).
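
For the UTF-8 versus ANSI case in the question, the steps above can be roughly sketched in C# as follows. This is only a minimal sketch, not Notepad's actual code: the class name EncodingSniffer and its methods are made up for illustration, the IsTextUnicode step is skipped because it is a Win32 API, and the UTF-8 check relies on .NET's strict UTF8Encoding, which follows the current UTF-8 definition rather than the older RFC 2279 one.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper; the name and the string labels are illustrative only.
public static class EncodingSniffer
{
    public static string Detect(byte[] data)
    {
        // Step 1: look for a byte order mark at the start of the data.
        if (data.Length >= 2 && data[0] == 0xFF && data[1] == 0xFE) return "UTF-16 LE";
        if (data.Length >= 2 && data[0] == 0xFE && data[1] == 0xFF) return "UTF-16 BE";
        if (data.Length >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF) return "UTF-8";

        // Step 2 (IsTextUnicode) is intentionally omitted in this sketch.

        // Step 3: try a strict UTF-8 decode; invalid byte sequences throw.
        var strictUtf8 = new UTF8Encoding(
            encoderShouldEmitUTF8Identifier: false,
            throwOnInvalidBytes: true);
        try
        {
            strictUtf8.GetString(data);
            return "UTF-8"; // pure ASCII content also ends up here
        }
        catch (DecoderFallbackException)
        {
            // Step 4: not valid UTF-8, so assume the system ANSI code page.
            return "ANSI";
        }
    }

    // Convenience wrapper that reads the whole file into memory first.
    public static string DetectFile(string path)
    {
        return Detect(File.ReadAllBytes(path));
    }
}
```

Calling EncodingSniffer.DetectFile(path) for each file in the directory would then give a best guess per file. Keep in mind that a file containing only ASCII bytes is valid in both encodings and is reported as "UTF-8" here, and that an ANSI file can, by coincidence, also be valid UTF-8, so the result remains a heuristic.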


answered Sep 29 '22 14:09

sukru
