In Ruby 1.9.x, we can specify the encoding with File.open('filename', 'r:iso-8859-1'). I often prefer to use a one-line File.read() when I am reading many short files directly into strings. Is there a way to specify the encoding directly, or do I have to resort to one of the following?
    str = File.read('filename')
    str.force_encoding('iso-8859-1')
or
    f = File.open('filename', 'r:iso-8859-1')
    s = ''
    while (line = f.gets)
      s += line
    end
    f.close
In Ruby 2.0 and later, source files and string literals are encoded in UTF-8 by default (Ruby 1.9 defaulted to US-ASCII). UTF-8 is a variable-width character encoding in which a single character takes between 1 and 4 bytes. Other encodings, such as UTF-16 and ISO-8859-1, are also supported.
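To make the 1-to-4-byte range concrete, here is a small sketch that counts the bytes a character needs in different encodings. The helper name byte_size_in is made up for this illustration; it only wraps String#encode and String#bytesize.

```ruby
# Hypothetical helper (not part of Ruby): bytes needed to store
# `text` once it has been converted to `encoding`.
def byte_size_in(text, encoding)
  text.encode(encoding).bytesize
end

puts byte_size_in("a", 'UTF-8')             # 1 (ASCII letter)
puts byte_size_in("\u00E9", 'UTF-8')        # 2 (e with acute accent)
puts byte_size_in("\u20AC", 'UTF-8')        # 3 (euro sign)
puts byte_size_in("\u{1F600}", 'UTF-8')     # 4 (emoji outside the BMP)

# The same character can need a different number of bytes elsewhere.
puts byte_size_in("\u00E9", 'ISO-8859-1')   # 1
puts byte_size_in("\u00E9", 'UTF-16LE')     # 2
```

This is why the encoding attached to a string matters when reading files: the same bytes on disk decode to different characters depending on which encoding Ruby assumes.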
Ruby's open modes include:
"r": Read-only, starts at beginning of file (default mode).
"r+": Read-write, starts at beginning of file.
"w": Write-only, truncates existing file to zero length or creates a new file for writing.
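A minimal sketch of how the three modes listed above behave, run against a throwaway file in a temporary directory. The method name demo_modes and the file name demo.txt are placeholders chosen for this example.

```ruby
require 'tmpdir'

# Hypothetical helper: exercise "w", "r+" and "r" on one file
# and return the final contents.
def demo_modes(path)
  # "w" creates the file (or truncates it) and writes from the start.
  File.open(path, 'w') { |f| f.write('hello') }

  # "r+" opens for reading and writing at the beginning without
  # truncating, so this overwrites only the first byte.
  File.open(path, 'r+') { |f| f.write('J') }

  # "r" is read-only; the block's value is returned by File.open.
  File.open(path, 'r') { |f| f.read }
end

Dir.mktmpdir do |dir|
  puts demo_modes(File.join(dir, 'demo.txt'))   # Jello
end
```

An encoding can be appended to any of these mode strings after a colon, as in 'r:iso-8859-1' from the question.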
Use File.readlines to Read Lines of a File in Ruby
File.readlines is a class method that takes a filename and returns an array of lines. Each line keeps its trailing newline character \n unless chomp: true is passed (Ruby 2.4 and later). Be cautious with large files: File.readlines reads all lines at once and loads them into memory.
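As a rough illustration, the sketch below reads a small temporary file with and without chomp: true, and uses File.foreach as one way to avoid holding every line in memory. The helper count_lines is a name invented for this example.

```ruby
require 'tempfile'

# Hypothetical helper: count lines by streaming them one at a time
# with File.foreach instead of building an array of all lines.
def count_lines(path)
  File.foreach(path).count
end

Tempfile.create('lines') do |f|
  f.write("one\ntwo\nthree\n")
  f.flush

  # Default: every element keeps its trailing newline.
  p File.readlines(f.path)                # ["one\n", "two\n", "three\n"]

  # chomp: true strips the line terminator (Ruby 2.4+).
  p File.readlines(f.path, chomp: true)   # ["one", "two", "three"]

  p count_lines(f.path)                   # 3
end
```

For the many short files mentioned in the question, readlines is fine; foreach only starts to matter once a file is too large to load comfortably.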
From the fine manual:
read(name, [length [, offset]], open_args) → string
Opens the file, optionally seeks to the given offset, then returns length bytes (defaulting to the rest of the file). read ensures the file is closed before returning. If the last argument is a hash, it specifies option for internal open().
So you can say things like this:
    s = File.read('pancakes', :encoding => 'iso-8859-1')
    s.encoding # => #<Encoding:ISO-8859-1>
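Here is a self-contained sketch of the same idea that you can run without an existing file. It writes one ISO-8859-1 byte to a temporary file first. The helper read_with_encoding is a name I made up for this example; the encoding option itself is the one described in the manual excerpt above.

```ruby
require 'tempfile'

# Hypothetical helper: read a whole file in one call and tag the
# resulting string with the given external encoding.
def read_with_encoding(path, encoding)
  File.read(path, encoding: encoding)
end

Tempfile.create('pancakes') do |f|
  f.binmode
  f.write("caf\xE9".b)   # 0xE9 is "e with acute accent" in ISO-8859-1
  f.flush

  s = read_with_encoding(f.path, 'iso-8859-1')
  puts s.encoding          # ISO-8859-1
  puts s.valid_encoding?   # true

  # An "external:internal" pair also transcodes while reading.
  u = File.read(f.path, encoding: 'iso-8859-1:utf-8')
  puts u.encoding          # UTF-8
  puts u == "caf\u00E9"    # true
end
```

Unlike force_encoding, which only relabels bytes that were already read, the external:internal form converts the bytes to UTF-8, which is usually what you want if the rest of the program works in UTF-8.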