
Reading UTF-8 with BOM in Ruby 2.5.0

Is there a way to read files encoded in UTF-8 with a BOM (byte order mark) on Ruby v2.5.0?

On Ruby 2.3.1 this used to work:

 csv = CSV.open(file_path, encoding: 'bom|utf-8')

However, on 2.5.0 the following error occurs:

  ArgumentError:
    unknown encoding name - bom|utf-8 

romeu.hcf asked Feb 19 '18 20:02



1 Answer

You can try opening the file yourself with a BOM-aware mode string:

File.open(file_path, "r:bom|utf-8")
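
To connect this back to the CSV case in the question, here is a minimal sketch that opens the file with the BOM-aware mode and hands the resulting IO object to CSV.new, so the encoding string never goes through CSV's own option handling. The helper name read_csv_with_bom and the sample file contents are made up for illustration, and the mode is written in the uppercase form "BOM|UTF-8" that Ruby's IO documentation uses. Whether this avoids the ArgumentError on 2.5.0 specifically is an assumption based on the answer above, not something verified here.

```ruby
require "csv"
require "tempfile"

# Hypothetical helper (not part of the original answer): opens the file with
# the BOM-aware mode string and passes the open IO to CSV.new.
def read_csv_with_bom(file_path)
  File.open(file_path, "r:BOM|UTF-8") do |io|
    CSV.new(io).read
  end
end

# Build a small sample file that starts with the UTF-8 BOM bytes EF BB BF.
sample = Tempfile.new(["bom_sample", ".csv"])
sample.binmode
sample.write("\xEF\xBB\xBFname,age\nAda,36\n".b)
sample.close

rows = read_csv_with_bom(sample.path)
first_cell = rows.first.first

# The BOM is consumed when the file is opened, so the first cell is clean.
puts first_cell.inspect
puts first_cell.encoding
```

If the first header came back with a leading "\uFEFF", that would indicate the BOM was not consumed on open.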

vdz answered Oct 12 '22 16:10