Is there a way to remove the BOM from a UTF-8 encoded file? All of my JSON files are encoded in UTF-8, but the data entry person who edited them saved them as UTF-8 with a BOM. When I run my Ruby scripts to parse the JSON, parsing fails with an error. I don't want to manually open 58+ JSON files and convert each one to UTF-8 without the BOM.

With Ruby >= 1.9.2 you can open the file with the mode <code>r:bom|utf-8</code>, which skips a leading BOM when one is present. This should work (I haven't tested it in combination with JSON):
<pre class="prettyprint"><code>require 'json'

json = nil # define the variable outside the block to keep the data
File.open('file.txt', "r:bom|utf-8") { |file|
  json = JSON.parse(file.read)
}
</code></pre>
It doesn't matter whether the file actually contains a BOM.
<hr>
Andrew remarked that <code>File#rewind</code> can't be used with BOM: it goes back to position 0, in front of the BOM. If you need a rewind function, you must remember the position after opening and replace <code>rewind</code> with <code>pos=</code>:
<pre class="prettyprint"><code># Prepare test file
File.open('file.txt', "w:utf-8") { |f|
  f << "\xEF\xBB\xBF" # add BOM
  f << 'some content'
}

# Read file and skip BOM if available
File.open('file.txt', "r:bom|utf-8") { |f|
  pos = f.pos          # position just after the BOM (if any)
  p content = f.read   # read and print file content
  f.pos = pos          # f.rewind would go to pos 0 instead
  p content = f.read   # (re)read and print file content
}
</code></pre>
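If you would rather remove the BOM from the files themselves than skip it on every read, something along these lines might do it. This is only a rough sketch: <code>strip_bom!</code> and <code>UTF8_BOM</code> are hypothetical names I made up for illustration, the sample file is created in a temporary directory, and you would point the glob at your own directory of JSON files (after making a backup).

```ruby
require 'json'
require 'tmpdir'

# The three bytes of a UTF-8 BOM, as a binary string.
UTF8_BOM = "\xEF\xBB\xBF".b

# Hypothetical helper (not part of Ruby or the json library):
# rewrites the file at +path+ without a leading UTF-8 BOM.
# Returns true if a BOM was removed, false if the file had none.
def strip_bom!(path)
  bytes = File.binread(path)
  return false unless bytes.start_with?(UTF8_BOM)

  File.binwrite(path, bytes.byteslice(UTF8_BOM.bytesize..-1))
  true
end

# Demo on a throwaway file; the directory and file name are assumptions.
Dir.mktmpdir do |dir|
  sample = File.join(dir, 'sample.json')
  File.binwrite(sample, UTF8_BOM + '{"name":"test"}')

  # Process every JSON file in the directory in one go.
  Dir.glob(File.join(dir, '*.json')).each { |file| strip_bom!(file) }

  data = JSON.parse(File.read(sample, encoding: 'UTF-8'))
  puts data['name']
end
```

Reading and writing in binary mode keeps the rest of the file byte-for-byte unchanged, and running the helper a second time on the same file is a no-op.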


answered Sep 30 '22 12:09

knut

Tags:

ruby

byte-order-mark

Abe
