What should be the default encoding for an API that reads from a URL using the file: protocol?

Question

I'm designing an API that takes a URL as input and reads the content at that URL. When the URL uses the "file:" protocol, which of these would make a better default for the character encoding?

the system's native encoding
UTF-8

The API allows the encoding to be set explicitly. There are also a few heuristics we can use to determine the character encoding, such as a BOM if one is present. But when all of these fail, what should the default be?
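The fallback order described above (explicit setting, then BOM, then a default) can be sketched roughly as follows. This is only an illustration in Python; the names `sniff_bom` and `resolve_encoding` are hypothetical and not part of the API being designed, and the default is left as a parameter because choosing it is the open question.

```python
import codecs

# Longer BOMs are checked first: the UTF-32 LE BOM (FF FE 00 00)
# begins with the same two bytes as the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_bom(data):
    """Return a codec name if data starts with a known BOM, else None."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


def resolve_encoding(data, explicit=None, default="utf-8"):
    """Pick an encoding: explicit setting, then BOM, then the default."""
    if explicit is not None:
        return explicit
    return sniff_bom(data) or default
```

The codec names returned for BOM matches ("utf-8-sig", "utf-16", "utf-32") are the Python codecs that consume the BOM while decoding, so the marker does not end up in the decoded text. One known weakness of this heuristic: UTF-16 LE content whose first character after the BOM is U+0000 would be misread as UTF-32 LE.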

As far as I can tell, the standards are silent on this issue. All else being equal, I want the right thing to happen most often for someone who doesn't even know there is such a thing as character encoding.

Dave Kerr · Accepted Answer

Always use UTF-8 if possible, and state this in your API documentation. UTF-8 is a rock-solid, future-proof encoding standard. I would avoid creating extra work for yourself by supporting other encodings. UTF-8 will also be easy to work with if you later expose the API through a web service.
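As a minimal sketch of what a documented UTF-8 default could look like, assuming a Python implementation: `read_text` is a hypothetical name, not the asker's actual API. The standard library's `urllib.request.urlopen` accepts file: URLs, and the caller can still override the encoding explicitly.

```python
from urllib.request import urlopen


def read_text(url, encoding="utf-8"):
    """Read the resource at url and decode it.

    The encoding defaults to UTF-8 (a documented default, as suggested
    above); pass encoding explicitly to override it.
    """
    with urlopen(url) as response:
        data = response.read()
    # Strict decoding: invalid bytes raise UnicodeDecodeError instead of
    # being silently replaced, so a wrong default fails loudly.
    return data.decode(encoding)
```

A call such as `read_text("file:///path/to/notes.txt")` would decode as UTF-8, while `read_text(url, encoding="latin-1")` covers a file known to use another encoding. The path shown is a placeholder.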

Tags: file, url, character-encoding, api-design

Matthew Simoneau
