
Encoding.Default is not the same as no encoding in File.ReadAllText?

(Sorry if this is a dupe)

I've just spent a long time trying to read a text file correctly.

Having started with File.ReadAllText(path) and getting screwed-up characters, I tried several variants of File.ReadAllText(path, Encoding), after which I got bogged down trying to analyse my input files to work out which byte was the problem, etc.

In desperation I tried File.ReadAllText(path, Encoding.Default), which worked!

I'm now struggling to understand why the default value is apparently only the default value if you specify it.

(My cut-down test string was +4433ç. I saved it in Notepad as ANSI, though with Swiss French regional settings...)

asked Aug 20 '09 11:08 by Benjol


1 Answer

Encoding.Default is the system's ANSI codepage (on .NET Framework; on .NET Core and later it is UTF-8 instead).

What File.ReadAllText does if you don't specify an encoding is this:

  • First it checks whether there's a byte order mark (UTF-8, UTF-16 or UTF-32). If there is, it uses the encoding specified in the byte order mark.
  • Otherwise, it uses UTF-8.

So the only way to get the system's ANSI codepage is to explicitly specify Encoding.Default.
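To see the difference in practice, here is a minimal sketch rather than a definitive test. It assumes Latin-1 (iso-8859-1) as a portable stand-in for the system's ANSI codepage, since ç is the single byte 0xE7 in both Latin-1 and Windows-1252. ReadAllTextDemo, AnsiLike and ReadBack are names made up for this illustration.

```csharp
using System;
using System.IO;
using System.Text;

static class ReadAllTextDemo
{
    // Assumption: Latin-1 stands in for the ANSI codepage, so the sketch
    // behaves the same whatever Encoding.Default happens to be.
    public static readonly Encoding AnsiLike = Encoding.GetEncoding("iso-8859-1");

    // Hypothetical helper: writes the bytes to a temp file and reads them back,
    // without an encoding argument when 'encoding' is null.
    public static string ReadBack(byte[] bytes, Encoding encoding)
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllBytes(path, bytes);
            return encoding == null
                ? File.ReadAllText(path)
                : File.ReadAllText(path, encoding);
        }
        finally
        {
            File.Delete(path);
        }
    }

    static void Main()
    {
        const string original = "+4433\u00E7"; // "+4433ç"

        // "ANSI" file: no byte order mark, ç stored as the single byte 0xE7.
        byte[] ansiBytes = AnsiLike.GetBytes(original);

        // No BOM, so the bytes are decoded as UTF-8; a lone 0xE7 is not
        // valid UTF-8, so the ç does not survive.
        Console.WriteLine(ReadBack(ansiBytes, null) == original);     // False

        // Decoding with the single-byte encoding recovers the text.
        Console.WriteLine(ReadBack(ansiBytes, AnsiLike) == original); // True

        // UTF-8 file with a BOM: detected without specifying an encoding.
        var utf8WithBom = new UTF8Encoding(encoderShouldEmitUTF8Identifier: true);
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllText(path, original, utf8WithBom);
            Console.WriteLine(File.ReadAllText(path) == original);    // True
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

For a real ANSI file on .NET Framework, passing Encoding.Default in place of AnsiLike is the call that worked in the question.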

answered Sep 22 '22 08:09 by Daniel