Remove multiple BOMs from a file

Tags:

byte-order-mark

I am using a JavaScript file that is a concatenation of other JavaScript files.

Unfortunately, the person who concatenated these files did not handle the encoding properly when reading them, so the BOM from every single source file was written into the concatenated JavaScript file.

Does anyone know a simple way to search through the concatenated file and remove any/all BOM markers?

A solution using PHP or a bash script for Mac OS X would be great.


asked Feb 01 '12 17:02

Macy Abbey

2 Answers

Find files that contain a BOM at the start of a line:

grep -rIl $'^\xEF\xBB\xBF' ./

Remove the BOM from those files (the --in-place option is GNU sed syntax):

grep -rIl $'^\xEF\xBB\xBF' . | xargs sed --in-place -e 's/\xef\xbb\xbf//'

Do the same, but skip .svn directories:

grep -rIl --exclude-dir=".svn" $'^\xEF\xBB\xBF' . | xargs sed --in-place -e 's/\xef\xbb\xbf//'

See more at: http://www.a5go.com/how-to-remove-bom-from-utf-8-using-sed.html#


answered Sep 18 '22 15:09

user1896665

I normally do it using vim:

vim -c "set nobomb" -c wq! myfile

answered Sep 21 '22 15:09

Pål Brattberg
