I typed in
file -I*
to look at all the encoding of all the CSV files in an entire directory. A lot of the file encodings are charset=binary. I'm not too familiar with this encoding format.
Does anyone know how to handle this encoding?
Thanks a lot for your time.
You can determine a files encoding and character set through the command line in Mac OS (and linux) by using the “file” command, which helps to retrieve general and specific information about a file type.
In poking around to try to figure out a better method to find out if a file is UTF-8 or not, I discovered just the command I needed: isutf8 . Yes, the name of the command is “is UTF8” all crammed together & lowercased, which certainly makes it easy to remember.
What file is telling you with charset=binary is that it doesn't have any more specific information than that the file contains bits and bytes (Capt'n Obvious to the rescue). It's up to you to interpret the file in the correct encoding/interpret it as the correct file format. Follow this answer to receive notifications.
"Binary" encoding pretty much means that the encoding is unknown.
Everything is binary data under the hood. In text files each byte, or sequence of bytes, represents a specific character, and which character in particular depends on the encoding the file was encoded with/you're interpreting the file with. Some encodings are unambiguously recognisable, others aren't (e.g. any file is valid in any single-byte encoding, you can't easily distinguish one single-byte encoding from another). What file
is telling you with charset=binary
is that it doesn't have any more specific information than that the file contains bits and bytes (Capt'n Obvious to the rescue). It's up to you to interpret the file in the correct encoding/interpret it as the correct file format.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With